ALGEBRAS WHICH DO NOT POSSESS A FINITE BASIS*


BY 


J. H. M. WEDDERBURN 


1. Introduction. The object of this paper is to classify algebras which 
do not have a finite basis. The methods used are similar to those employed 
in a former paper,† but considerable difficulty was experienced in extending
the results of this paper, as the proofs of many of the principal theorems 
depended on the use of induction and were therefore tied up with the 
finiteness of the basis; and, in fact, these difficulties have been only par- 
tially overcome, as is shown by the postulates assumed in § 5. It is hoped, 
however, that, in spite of their incompleteness, the results presented here 
will be found of sufficient interest to justify their publication. 

It is noteworthy how little place the finiteness of the basis—or indeed 
the presence of any basis at all—has in the principal theorems of linear 
algebras. The first theorem of importance in which it seems to be required 
is that in which it is shown that primitive idempotent elements exist in 
an algebra which possesses elements of finite rank that are not nilpotent; 
and in two other cases it has not been found possible to complete the 
argument when a finite basis is not assumed, namely, in the theorems which 
state that, if an algebra is not nilpotent, it contains an idempotent element, 
and that the maximal nilpotent invariant subalgebra can be separated from 
the rest of the algebra. 

The proofs of many of the theorems parallel those for the case of algebras 
with a finite basis very closely, so closely in fact that it might have been 
sufficient to refer the reader to previous treatments of the subject. It has 
been thought advisable, however, to repeat most of these proofs, as other- 
wise the reader would feel much uncertainty as to the logical completeness 
of the treatment. In one or two cases reference has been made to the 
paper mentioned above, or to Professor L. E. Dickson’s treatise,‡ in place
of giving a detailed proof. 

It was found inconvenient to give in one section all the postulates used 
as in several cases their statement involved some previous discussion. After 
a short discussion in § 2 of algebras defined in the manner used by Hamilton, 


* Presented to the Society, May 3, 1924. 
† Proceedings of the London Mathematical Society, ser. 2, vol. 6 (1907),
pp. 77-118; this paper is cited hereafter as W.
‡ L. E. Dickson, Algebras and their Arithmetics, Chicago, 1923, cited hereafter as D.


the postulates common to all associative algebras are given in § 3 while 
those peculiar to algebras which do not have a finite basis are given in § 5.

2. The Hamiltonian definition. Hamilton’s definition* of a linear
associative algebra may be modified as follows. Let t be a variable which
runs through a given range or set of values G (which need not be numerical
although this will generally be the case) and ξ(t) a single-valued function
which is defined for every value of t in G and which has values, for the
present restricted to be finite, which lie in a given field† F. Two such
functions, ξ(t) and η(t), are said to be equal if, and only if, ξ(t) = η(t)
for every value of t in G. The sum ξ + η is the ordinary sum in the field F;
the product ξ(t) × η(t) may not be the ordinary product but is to be
defined in any particular case subject to the following conditions:


(2.1)

(2.2)


We shall also assume that the product of two functions in a given set 
belongs to the same set. If this condition does not hold in a given set, 
as in Grassmann’s calculus, the set may always be so extended that it has
this property. 

If ε(t, α) is a function of t and α in G such that

(2.31)

then

(2.32)   $\xi(t) = \sum_{G(\alpha)} \xi(\alpha)\,\varepsilon(t,\alpha).$

We may therefore set

(2.4)


Here the exact meaning of the Σ is best left somewhat indefinite, a special
definition being given in any particular case; the properties required of it 
are detailed in the next section and in the meantime it suffices to give two 


* See the introduction to his Lectures on Quaternions, Dublin, 1853; also Transactions
of the Royal Irish Academy, vol. 17 (1835), pp. 293-422, vol. 21 (1843), pp. 199-296.
† This may be generalized considerably by taking in place of F some linear associative
algebra already defined (so giving the direct product) or even an algebra such as the
algebra of logic.




examples. If the elements of the range G form an enumerable set, Σ_G denotes
ordinary summation, of which algebras with a finite basis form a particular
case, other examples being given in § 9. Again, if G is an interval of
the real continuum, Σ_{G(α)} ξ(α)ε(t, α) may be defined as the Stieltjes integral

$\int d\bigl(\xi(\alpha)\,\varepsilon(t,\alpha)\bigr)$

if ξ(α) is properly restricted.


Any set of functions e_α(t), α in G, are said to form a linearly independent
set in G if every relation of the form

$\sum_{G(\alpha)} \xi(\alpha)\, e_\alpha(t) = 0$

for every t in G entails ξ(α) = 0 in G. If, further, every function of
the algebra can be expressed in the form

$\xi(t) = \sum_{G} \xi(\alpha)\, e_\alpha(t),$

the set e_α(t) is said to form a basis of the algebra, and the cardinal number
of G is called the order of the algebra.

When a basis is used to define an algebra, the functional notation is not
usually convenient. On the analogy of algebras with a finite basis we shall
generally write (2.4) in the form

$x = \sum_{G(\alpha)} \xi(\alpha)\, e_\alpha.$

The values of ξ(α) are called the coefficients of x and, if λ is any constant
mark of the field F, we shall set

$\lambda x = \sum_{G} \lambda\,\xi(\alpha)\, e_\alpha.$

When this point of view is adopted, the product of x_1 and x_2 will be denoted
by x_1 x_2.
The condition that the product is associative may be stated in much the
same way as when a finite basis exists. Since we are assuming that x_1 x_2
lies in the given set, we must have

(2.5)   $e_r e_s = \sum_{G(t)} k(r,s,t)\, e_t,$

where k(r,s,t) is some function defined for r, s, t in G, and, if multiplication
is associative, we find in the usual manner

$\sum_{G(\alpha)} k(r,s,\alpha)\, k(\alpha,t,\beta) = \sum_{G(\alpha)} k(s,t,\alpha)\, k(r,\alpha,\beta).$

If the units used are the Hamiltonian ones described above, equation (2.5)
may be written

(2.62)   $e_r(\alpha) \times e_s(\alpha) = k(r,s,\alpha),$

or, for any functions ξ and η of the set,

(2.71)   $\xi(\alpha) \times \eta(\alpha) = \sum_{G} \xi(r)\,\eta(s)\,k(r,s,\alpha).$

It is then clear that we may take as a particular interpretation of Σ (or
as our definition of the product of two functions)

(2.72)   $\xi(\alpha) \times \eta(\alpha) = \iint \xi(r)\,\eta(s)\,k(r,s,\alpha)\,dr\,ds$

with

(2.73)   $\int k(r,s,t)\,k(t,\beta,\alpha)\,dt = \int k(r,t,\alpha)\,k(s,\beta,t)\,dt$

as the condition of associativity.*
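
The discrete analogue of the associativity condition can be checked mechanically when the range G is finite. The following is a minimal sketch, not part of the original paper: it assumes the condition takes the summed form Σ_t k(r,s,t) k(t,β,α) = Σ_t k(r,t,α) k(s,β,t), and the function names (`complex_constants`, `cross_constants`, `is_associative`) are illustrative only. The constants of ordinary complex numbers pass the test, while those of the vector cross product, which is not associative, fail it.

```python
from itertools import product


def complex_constants():
    """Structure constants k[r][s][t] of the complex numbers on the basis (1, i)."""
    k = [[[0, 0] for _ in range(2)] for _ in range(2)]
    k[0][0][0] = 1    # 1 * 1 = 1
    k[0][1][1] = 1    # 1 * i = i
    k[1][0][1] = 1    # i * 1 = i
    k[1][1][0] = -1   # i * i = -1
    return k


def cross_constants():
    """Structure constants of the cross product on three units (not associative)."""
    k = [[[0] * 3 for _ in range(3)] for _ in range(3)]
    for i, j, t in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
        k[i][j][t] = 1    # e_i x e_j = e_t
        k[j][i][t] = -1   # e_j x e_i = -e_t
    return k


def is_associative(k):
    """Check sum_t k(r,s,t) k(t,b,a) == sum_t k(r,t,a) k(s,b,t) for all r, s, b, a."""
    n = len(k)
    for r, s, b, a in product(range(n), repeat=4):
        left = sum(k[r][s][t] * k[t][b][a] for t in range(n))
        right = sum(k[r][t][a] * k[s][b][t] for t in range(n))
        if left != right:
            return False
    return True
```

The left-hand sum collects the coefficient of e_α in (e_r e_s) e_β and the right-hand sum that in e_r (e_s e_β), so the check is exactly the comparison of the two bracketings on basis elements.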

Examples of functions k(r,s,t) which satisfy (2.73) are easily constructed
by employing orthogonal functions in conjunction with the constants of an
algebra with an enumerable basis. The following illustrations are constructed
from the constants of ordinary complex numbers and quaternions:

(2.81)   $k(r,s,t) = \sum_m k_m \sin m(r+s-t);$

(2.82)   $k(r,s,t) = \sum_m k_m \{\, \dots + [\sin(2m+1)r\,\sin(2m+2)s + \sin(2m+2)r\,\sin(2m+1)s]\,\sin(2m+2)t \,\};$

(2.83)   $k(r,s,t) = \sum_m k_m \{\, \sin(2m+1)(r+s-t) + \sin[(2m+1)r+(2m+2)(s-t)] - \dots + \sin[(2m+2)r-(2m+2)s-\dots] \,\}.$


* It will often be necessary here to replace ordinary integration by one of its many
generalizations and suitable restrictions must, of course, be placed on the functions involved.




Here the constants k_m are only restricted by considerations of convergence
and integrability, and the range of each of the variables is over a period 2π.
It is clear that, subject to these conditions, (2.81) represents any odd periodic
function of r+s−t with period 2π, so that the operation

$\xi(\alpha) \times \eta(\alpha) = \int_{-1}^{1}\!\int_{-1}^{1} \xi(r)\,\eta(s)\,k(r+s-\alpha)\,dr\,ds$

defines an algebra in the range −1 to +1 when k(r) is an odd periodic
function with the period 2.
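
That a kernel of the form (2.81) satisfies the associativity condition (2.73) can be confirmed numerically. The sketch below is an illustration under stated assumptions rather than anything from the paper: the coefficients k_m = 1, 1/2, 1/4 are arbitrary, the integrals are taken over one period from 0 to 2π, and the helper names are hypothetical. An equal-weight rule is used because it is exact, up to rounding, for trigonometric polynomials of low degree.

```python
import math


def kernel(r, s, t, coeffs=(1.0, 0.5, 0.25)):
    """k(r,s,t) = sum_m k_m sin(m (r + s - t)), with k_m = coeffs[m-1] (assumed values)."""
    return sum(c * math.sin(m * (r + s - t)) for m, c in enumerate(coeffs, start=1))


def integrate_period(f, n=256):
    """Equal-weight quadrature of f over one period [0, 2*pi)."""
    h = 2.0 * math.pi / n
    return h * sum(f(i * h) for i in range(n))


def associativity_gap(r, s, beta, alpha):
    """Absolute difference between the two sides of (2.73) for the kernel above."""
    left = integrate_period(lambda t: kernel(r, s, t) * kernel(t, beta, alpha))
    right = integrate_period(lambda t: kernel(r, t, alpha) * kernel(s, beta, t))
    return abs(left - right)
```

Both sides reduce analytically to −π Σ k_m² cos m(r+s+β−α), the cross terms between different values of m vanishing by orthogonality, so the gap should be at the level of rounding error for any choice of arguments.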

3. Fundamental postulates and definitions. We shall now give a 
more abstract definition by means of postulates without attempting, however, 
to make these independent, the aim being descriptive rather than analytical. 

A linear associative algebra A is a set of two or more elements a, b, c, --- 
subject to two operations, namely, addition, which will be denoted by +, and
multiplication, which will be denoted simply by juxtaposition of the factors. 
These operations are subject to the following conditions. 

POSTULATE 3.11.

A_1: a+b is an element of A;
     a+b = b+a;
     a+(b+c) = (a+b)+c;
A_2: There is an element 0 such that a+0 = a for every element a of A;
A_3: For every element a there exists an element b such that a+b = 0;
M_1: ab is an element of A;
M_2: a·bc = ab·c;
AM: a(b+c) = ab+ac, (b+c)a = ba+ca.


It is easily seen that 0 is unique in A, and that b in A_3 is uniquely
determined when a is given; b is denoted by −a and −(−a) = a.

If a is any element, a+a is denoted by 2a and in general a+a+···+a
(m terms) is written ma; evidently (ma)b = a(mb) = m(ab). When
a ≠ 0, it is not difficult to show that the smallest integer for which ma = 0,
if such an integer exists, is always a prime; we shall assume that this 
prime, if it exists, is the same for all elements since, when this is not the 
case, the algebra is reducible. When no such integer exists, we assume 
the following postulates. 

POSTULATE 3.12. If a is an element of A different from 0 and m is a
positive integer, there exists an element b such that a = mb. 

The element b, which is unique, will be denoted by (1/m)a. This 
postulate is sufficient for many purposes, but the following one, which in- 
cludes it, will usually be more convenient. 




POSTULATE 3.13. (i) There is associated with A a field F such that to any
non-zero element a of A there is allied a subset A_a of A which is in (1, 1)-
correspondence with the elements of F, a and 0 in A_a corresponding respectively
to 1 and 0 in F; the element corresponding to a mark ξ of F is denoted
by ξa. The correspondence is preserved under the operation of addition, that
is, ξ_1 a + ξ_2 a = (ξ_1 + ξ_2)a;

(ii) If b ≠ 0 is any element of A_a, then A_b = A_a and, if ξ_2 b = ξ_1 a (ξ_2 ≠ 0),
then a = (ξ_2 ξ_1^{-1}) b;

(iii) (S,a+ 8b) 


(iv) 


This composite postulate is broadly equivalent to saying that the elements 
of A correspond to an affine geometry in which these elements are the 
points of the geometry, or to a projective geometry in which the sets A_a
correspond to points.

The combined postulates 3.11, 12 and 13 will be referred to as Postulate 3.1. 
In these postulates we have considered combinations of elements by a finite 
number only of applications of the fundamental operations. Later we shall 
see that infinite sums are required in certain cases which are introduced 
by postulates as required. We have, however, to frame our definitions from 
the start so as to admit the possibility of such combinations of an infinite 
number of terms and it is therefore necessary to detail the properties 
required of the summation sign Σ. For the present we shall merely say
that, if x_t is a set of elements of A in (1, 1)-correspondence with a range
or set of values of a variable t, then for certain ranges, which depend
on the particular algebra under discussion, there exist elements denoted by
Σ_G ξ_t x_t, where ξ_t is a function of t in G whose values lie in F, and the
summation sign Σ has the following properties:

(a) If G contains only a finite number of elements 1, 2, ···, n, Σ_G ξ_t x_t
denotes ξ_1 x_1 + ξ_2 x_2 + ··· + ξ_n x_n.

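(b) Whether G is finite or not, if y_t is also a set of elements of A defined
for t in G and z is any element of A, then

$\sum_G (\xi_t x_t + \eta_t y_t) = \sum_G \xi_t x_t + \sum_G \eta_t y_t;$

$z \sum_G \xi_t x_t = \sum_G \xi_t z x_t, \quad \Bigl(\sum_G \xi_t x_t\Bigr) z = \sum_G \xi_t x_t z,$
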


provided always that in each case the summations used have a meaning. 




When Σ_G ξ_t x_t exists in A, we shall say that it is linearly dependent on
x_t, t in G.

A complex* in A is any subset of A which is closed under the operation
of addition but not necessarily under that of multiplication. If B and C
are complexes such that every element of B is an element of C and vice
versa, we write B = C; if B contains all the elements of C and also
elements not in C, we write B > C or C < B. If the order of a complex
is 1, that is, if when b is any non-zero element of B all elements of it
have the form ξb, ξ an element of F, we shall write B = (b) or, when
there is no risk of confusion, B = b; thus x ≤ B means that x is an element
of B. The intersection of two complexes B and C is the complex of all
elements common to both; it is denoted by B ⌢ C.

If B and C are two complexes, the complex of all elements which can
be derived from the elements of B and C by means of the operation of
addition is called the sum of B and C and is written B + C. Evidently
the addition so defined is commutative and associative. Similarly, if B_t, t
in a range G, is a set of complexes, Σ_G B_t is the complex of all elements


derivable from the totality of elements in the B; by means of the operation 
of addition. Even if G is an infinite range, this does not necessarily involve 
infinite sums of elements. 

If x and y are variable elements of B and C respectively, the totality
of elements of the form xy together with those elements derivable from
them by the operation of addition is called the product of B into C and
is written BC. The multiplication so defined is associative and distributive.
We may also note here that

$A \frown (B \frown C) = (A \frown B) \frown C,$

$A(B \frown C) \le AB \frown AC, \quad (B \frown C)A \le BA \frown CA.$


If C is a subcomplex of a complex B, any two elements x_1, x_2 of B
for which (x_1 − x_2) ≤ C are said to be congruent modulo C and we write
x_1 ≡ x_2 (mod C); all elements congruent to x_1 modulo C, that is, all elements
of the form x_1 + y where y ≤ C, are said to form a class† modulo C. The
class corresponding to x_1 may be written [x_1]; it is completely determined
when any one of its elements is given. The class [0] is the complex C
itself.

* This term was introduced into the theory of finite groups by Frobenius in a similar
sense. Dickson uses the term “linear set”, Scorza, “linear system”.
† Cf. D, p. 80.




Two complexes D and E are said to be congruent modulo C if there is
a (1,1)-correspondence between their elements such that, if x and y are
corresponding elements of D and E respectively, then x − y ≤ C.

If a complex B has a finite basis, it is clear that, when C is a proper
subcomplex of B, there exists a complex D such that B = C + D, C ⌢ D = 0;
but here, as we do not assume the existence of any basis, it is necessary
to have the following postulate:

POSTULATE 3.2. If C is a subcomplex of a complex B, there exists a complex D
which has no element in common with C and for which B = C + D.

The complex D is called a supplement of C in B. 

A few definitions are conveniently given here. Any element different
from 0 which is equal to its own square is said to be idempotent. If e is
idempotent and x an element of A, then, if there exists an element y such
that xy = e (yx = e), x is said to have a right (left) inverse with respect
to e; if x has neither a right nor a left inverse with respect to e, it is
said to be singular with respect to e. It may be noticed here that, if xy = e
is idempotent, then yex is also idempotent. If there is an element m such
that ma = a = am for every element a of A, it is called the modulus of A;
it is evidently unique. If some power of an element is 0, it is said to be
nilpotent, and, if x^n is the lowest power of x which is 0, n is called the
index of x.
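
The remark that yex is idempotent whenever xy = e is idempotent can be illustrated with a pair of 2×2 integer matrices. This is only a sketch: the particular matrices x and y are chosen here for illustration and do not appear in the paper, and `matmul` is a small helper written for the purpose.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


x = [[1, 1], [0, 0]]
y = [[1, 0], [0, 0]]

e = matmul(x, y)              # xy, which is idempotent for this choice
f = matmul(matmul(y, e), x)   # yex

e_is_idempotent = matmul(e, e) == e
f_is_idempotent = matmul(f, f) == f
```

The general reason is the one-line computation (yex)(yex) = ye(xy)ex = ye·e·ex = yex, which uses only xy = e and e² = e.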

If there exist in a complex B a set of elements x_t, t in a range G,
such that (i) every element of B has the form Σ_G ξ_t x_t, the ξ’s being elements
of the field, and (ii) Σ_G ξ_t x_t = 0 if, and only if, ξ_t = 0 for every t in G,
then the set x_t is called a basis of B. The cardinal number of the set G
is called the order of the basis and, if this number is unique, it is called 
the order of the complex. The existence of a basis is not assumed in this 
paper, but all the examples constructed so far possess one. 

The algebra generated by the elements x_1, x_2, x_3, ··· is denoted by
{x_1, x_2, ···}; the order of {x} is called the rank of x.

Integral powers of a complex are defined in the usual manner; thus 
Bu = B. The condition that a complex is an algebra 
then takes the form B² ≤ B. A phenomenon occurs here in the case of
algebras which do not have a finite basis which is not present when the
order is finite. Let A be the algebra generated by a, b, c where


ab ba = 
If A_1 = {a}, A_2 = {b}, then


A A, + A, + (c), A, Ay = (c), 


1.4, = Ae = cA, = cA, = Ape = 0, 




and hence 


A” + 42 +(e). 


The complex C = (c) is therefore common to all integral powers of A
and it is clearly the only such complex; we therefore write


A’ = ta 


a 
Since (? 0, we have A®%** 0. Similarly in the algebra defined by 


we have Band A® be 0. 

The smallest ordinal number ν for which (A^ν)² = A^ν is called the index*
of A. For instance, in the second example given above the index of A
is … and that of B is …. If A^ν = 0, as in this example, A is said to
be nilpotent.

4. Invariant subalgebras. A complex B in an algebra A such that
AB ≤ B, BA ≤ B is itself an algebra, and it is said to be an invariant
subalgebra of A. The first two theorems regarding such subalgebras are
proved in exactly the same way as when there is a finite basis and hence
they are merely stated here.

THEOREM 4.1. If B is a proper invariant subalgebra of an algebra A,
an algebra can be derived from A by regarding as equivalent those elements
of A which differ only by an element of B.†

This algebra is called the difference algebra of A and B and is denoted
by A − B. To any algebraic identity in A − B there corresponds a
congruence in A modulo B.

THEOREM 4.2. If B_1 and B_2 are proper invariant subalgebras of A, and
B_1 > B_2, then A − B_2 has an invariant subalgebra which is simply isomorphic
with B_1 − B_2, and conversely.‡


* When ν is finite, it is easily shown that this definition is equivalent to saying that ν
is the smallest integer for which A^ν = A^{ν+1}.
† Cf. W, p. 82; D, p. 39.
‡ Cf. D, p. 41.




An algebra which has no proper invariant subalgebra is said to be
simple. If B is a maximal invariant subalgebra of A, A − B is simple
and conversely.

An algebra is said to be the direct sum of two proper subalgebras
A_1, A_2 if

(4.1)   $A = A_1 + A_2, \quad A_1 A_2 = 0 = A_2 A_1, \quad A_1 \frown A_2 = 0;$

and, when such a form for A exists, it is said to be reducible. When (4.1)
holds, we shall write A = A_1 ⊕ A_2 in place of A_1 + A_2 when it is desired
to indicate that A is reducible; the component parts of the sum will be
referred to as reduced parts of A. A reduced part is evidently an invariant
subalgebra.

THEOREM 4.3. If an algebra A has a proper invariant subalgebra B that
possesses a modulus, A is reducible.

Let C′ be a supplement of B in A so that

$A = B + C', \quad B \frown C' = 0;$

also let e_1 be the modulus of B and y′ a variable element of A. If we set

(4.2)   $y = y' - e_1 y' - y' e_1 + e_1 y' e_1,$

then, as y′ varies in A, y evidently traces out a complex C in A. This
complex is congruent to C′ modulo B since, B being invariant, e_1 y′, y′e_1, e_1 y′e_1
are elements of B, and y = 0 if, and only if, y′ ≤ B; hence A = B + C,
B ⌢ C = 0.

If x is any element of B,

$xy = xy' - xe_1 y' - xy'e_1 + xe_1 y'e_1 = 0$

since xe_1 = x; therefore BC = 0, and similarly CB = 0. Since B has
a modulus, this shows that C is an algebra, which also follows from
e_1 C = 0 = Ce_1, so that, if y and y_1 are any elements of C,

$y y_1 = y y_1' - e_1 y y_1' - y y_1' e_1 + e_1 y y_1' e_1$

by (4.2). The theorem then follows from the definition of reducibility.
COROLLARY. The algebra C is unique.
For, if A = B ⊕ D, every element x of D has the form y+z, where
y¥<B, z<D and ax aytaze = y since B= B. Hence 
every element of D belongs to C and conversely. 
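
The construction in the proof of Theorem 4.3 can be tried on a small matrix algebra. The sketch below rests on assumptions that should be stated plainly: equation (4.2) is not legible in the source, and the form y = y′ − e_1 y′ − y′ e_1 + e_1 y′ e_1 used here is inferred from the surrounding argument; the algebra (block-diagonal 3×3 matrices with B the upper 2×2 block) and all names are illustrative.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def strip_modulus(e1, y_prime):
    """Assumed form of (4.2): y = y' - e1 y' - y' e1 + e1 y' e1."""
    left = matmul(e1, y_prime)
    right = matmul(y_prime, e1)
    both = matmul(left, e1)
    return add(sub(sub(y_prime, left), right), both)


ZERO = [[0] * 3 for _ in range(3)]
e1 = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]       # modulus of B (upper 2x2 block)
y_prime = [[1, 0, 0], [0, 0, 0], [0, 0, 1]]  # spans a supplement C' that is not orthogonal to B
b = [[1, 2, 0], [3, 4, 0], [0, 0, 0]]        # a sample element of B

y = strip_modulus(e1, y_prime)               # the corresponding element of C
```

Here the supplement C′ spanned by y′ does not annihilate B, but the element y obtained from it does, which is what makes A = B ⊕ C a direct sum.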




As a converse to the preceding theorem we have the following

THEOREM 4.4. If A = B ⊕ C has a modulus, so have also B and C.

For, if e is the modulus of A, we have e = e_1 + e_2 where e_1 ≤ B, e_2 ≤ C,
and if x ≤ B, then


since Bac = 0. 

The following two minor theorems are occasionally useful.

THEOREM 4.5. If A_1 and A_2 are algebras such that A_1 A_2 = 0,
and if either A_1 or A_2 has a modulus, then A_1 ⌢ A_2 = 0.

For, if A_1 has a modulus e_1 and B = A_1 ⌢ A_2, then e_1 B = B since
B ≤ A_1, and e_1 B = 0 since B ≤ A_2; hence B = 0.

THEOREM 4.6. If A = A_1 ⊕ B_1 = A_2 ⊕ B_2, and if A_1 and A_2 are irreducible
and each has a modulus, then either A_1 = A_2, or A_1 A_2 = 0 = A_2 A_1
= A_1 ⌢ A_2.

Let e_1 be the modulus of A_1 and e_2 that of A_2. Since A_1 and A_2 are
invariant and each has a modulus, it follows that

$A_1 A_2 = A_1 \frown A_2 = A_2 A_1.$

Hence e_1 e_2 ≤ A_1 ⌢ A_2 and is consequently its modulus. But by Theorem 4.3,
A_1, being irreducible, cannot have a proper invariant subalgebra with a
modulus; hence either A_1 = A_2 or A_1 ⌢ A_2 = 0.

5. Idempotent elements. The theory of idempotent elements is some- 
what more elusive in the case of infinite algebras than in that of algebras 
with a finite basis; and certain difficulties arise which it has not proved 
possible, so far, to overcome except by restricting the class of algebra 
considered by further postulates. 

If e is an idempotent element of an algebra A, it is the modulus of eAe;
and, if eAe contains no idempotent element besides e, the latter is said to
be primitive. When e is not primitive, eAe then contains at least one
idempotent element e′, which is necessarily commutative with e so that
e″ = e − e′ is also idempotent and e′e″ = 0 = e″e′. More generally, if
e_s and e_t are commutative idempotent elements, then e_s e_t, e_s − e_s e_t, e_t − e_s e_t
are all idempotent (unless one of them is 0), the product of any two of
them is 0, and the complex formed from them contains both e_s and e_t.
Further, since e_s e_t is contained in both e_s A e_s and e_t A e_t, it follows that
e_s e_t = 0 when e_s and e_t are primitive; hence in a set of primitive idem-
potent elements which are commutative with each other the product of any
two is necessarily zero.
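
The statement about two commutative idempotent elements can be checked on a simple model. In this sketch, which is an illustration and not taken from the paper, diagonal 3×3 matrices are represented by their diagonals, so that multiplication is componentwise and all elements commute; the names `mul` and `sub` are hypothetical helpers.

```python
from itertools import combinations


def mul(a, b):
    """Product of two diagonal matrices, each stored as the tuple of its diagonal."""
    return tuple(p * q for p, q in zip(a, b))


def sub(a, b):
    return tuple(p - q for p, q in zip(a, b))


e_s = (1, 1, 0)
e_t = (0, 1, 1)

common = mul(e_s, e_t)
parts = [common, sub(e_s, common), sub(e_t, common)]

all_idempotent = all(mul(p, p) == p for p in parts)
pairwise_zero = all(mul(p, q) == (0, 0, 0) for p, q in combinations(parts, 2))
```

The three pieces are mutually orthogonal idempotents, and e_s and e_t are recovered as sums of two of them, in agreement with the remark that the complex formed from the pieces contains both.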




When an algebra A has a finite basis, it is readily proved that primitive 
idempotent elements exist whenever there is some element in A which is not 
nilpotent. Our postulates for infinite algebras, however, are not sufficiently 
strong to enable us to draw similar conclusions as is seen from example 9.7, 
in which it can be shown that no idempotent element exists except when 
certain infinite series of elements are admitted as elements of A. We 
therefore assume the following postulate. 

POSTULATE 5.1. An algebra which contains an idempotent element possesses 


at least one primitive idempotent element. 

Let us suppose that A contains a primitive idempotent element e_1, and
let e_t, t in a range G, be the set of all primitive idempotent elements which
are commutative with e_1 and with each other; it then follows as above
that e_r e_s = 0 when r ≠ s. Such a set is called a complete primitive
complementary set, and e = Σ_G e_t, if it exists, is called a principal idempotent
of A. Any set of idempotent elements e_t, primitive or not, for which
e_r e_s = 0 (r ≠ s), will be called a complementary set.

If e is any idempotent element and x a variable element of A, then

$x = x_{00} + x_{10} + x_{01} + x_{11},$

where

$x_{11} = exe, \quad x_{10} = ex - exe, \quad x_{01} = xe - exe, \quad x_{00} = x - ex - xe + exe.$

As x runs through the elements of A, the elements x_{00} evidently form a
complex which we shall denote by A_{00} = Σ x_{00}, and we have similarly
the complexes

$A_{10} = \sum x_{10}, \quad A_{01} = \sum x_{01}, \quad A_{11} = \sum x_{11} = eAe.$

These complexes are obviously supplementary and

(5.1)   $A = A_{00} + A_{10} + A_{01} + A_{11}.$

This is called the Peirce decomposition of A relatively to e; A_{00} + A_{10} is
the complex of all elements y of A for which ye = 0, A_{00} + A_{01} is the
complex of elements for which ey = 0, and A_{00} the complex for which
ey = 0 = ye.*
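
The Peirce decomposition relative to an idempotent can be computed directly for matrices. The following sketch assumes the usual component formulas x_11 = exe, x_10 = ex − exe, x_01 = xe − exe, x_00 = x − ex − xe + exe, with the index convention chosen so that elements of A_10 satisfy ye = 0 and elements of A_01 satisfy ey = 0, as the text describes; the function name `peirce` and the sample matrices are illustrative.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def peirce(e, x):
    """Return the four Peirce components of x relative to the idempotent e."""
    ex, xe = matmul(e, x), matmul(x, e)
    exe = matmul(ex, e)
    return {
        "11": exe,
        "10": sub(ex, exe),
        "01": sub(xe, exe),
        "00": add(sub(sub(x, ex), xe), exe),
    }


e = [[1, 0], [0, 0]]
x = [[1, 2], [3, 4]]
parts = peirce(e, x)

total = add(add(parts["00"], parts["10"]), add(parts["01"], parts["11"]))
```

For this choice of e the four components are simply the four matrix entries of x placed in their own positions, which makes the supplementary character of the complexes easy to see.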

Before extending this decomposition to the case where a complete com- 


plementary set replaces e, we require the following postulates. 


* It is sometimes convenient to note that, if B = Σ(ex − xe), then A_{01} = Be and
A_{10} = eB.




POSTULATE 5.2. If e_t, t in G, is a complementary set of idempotent elements,
the element e = Σ_G e_t exists in A and, if x is any element of A, then



POSTULATE 5.3. If x_t, t in G, is a set of elements of A such that x = Σ_G x_t
exists, then

$xy = \sum_G x_t y, \quad yx = \sum_G y x_t$

for every element y of A.
Let e_t, t in G, be a complementary set of idempotent elements, and put
e = Σ_G e_t in (5.1). In view of the postulates just given we may set


G G 


or, if 
> (ar— exe). Aog Aste = es Aber, 


then 


Aoo Ato + & + Ast. 


where the intersection of any two complexes is zero. 

If the set e_t is a complete primitive complementary set, e is a principal
idempotent element of A, and A_{00} then contains no idempotent element,
as otherwise e_t would not be a complete set. If A has a modulus, it
equals e and A_{00}, A_{10} and A_{01} are 0.

Before proceeding further we must consider more closely the nature of 
individual elements of A. When a finite basis exists, either for A itself 
or for some subalgebra which is not nilpotent, then A contains an idem- 
potent element; but, when no such basis exists, the usual proofs break 
down. A closely related theorem is that, if e is the only idempotent element 
in A, every element which has no inverse with respect to e is nilpotent, 
and that the totality of such elements forms a nilpotent invariant subalgebra. 
The proof of this theorem, in one method of attack at least, leads to an 
equation of the form

$x + y = e,$

where neither x nor y has an inverse with respect to the primitive
idempotent element e. If x is a nilpotent element of index n for which
ex = x = xe, then

$y(e + x + x^2 + \cdots + x^{n-1}) = e = (e + x + x^2 + \cdots + x^{n-1})y,$




which is impossible since we have assumed that y has no inverse. If x
is not nilpotent, or if n is not finite, this method of proof requires the
existence of the infinite series

$e + x + x^2 + \cdots$

as an element of A. But, if we assume that this element does exist,
certain difficulties arise. If z = e − x, we should naturally expect that
the algebras generated by x and z respectively would be simply isomorphic
since the elements of their bases have the same law of combination; but
in spite of this isomorphism we cannot assume the existence of

$w = e + z + z^2 + \cdots,$

since then wx = w(e − z) = e, whereas x has no inverse; and, moreover,
the element w, although an element of {e, x}, cannot be expressed
in terms of the basis e, x, x², ···, at least with finite coefficients. The
nature of an element x, therefore, cannot be predicted from the laws of
combination of {x} alone, but the relation of x to other elements of A
must be known also. Another example of this is given in the algebra
(cf. examples 9.2, 9.8) whose basis is

$\cdots, x^{-1}, e = x^0, x, x^2, \cdots$

and in which it is assumed that every element of the form Σ_{r=−n}^{∞} ξ_r x^r exists
for finite values of n. In this algebra the subalgebra of all elements of
the form Σ_{r=0}^{∞} ξ_r x^r does not seem to be distinguishable from the one
discussed above, although in the complete algebra x has the inverse x^{−1}.

Instead of attempting to resolve these difficulties by a discussion of the 
nature of a basis in general, we shall be content for the present to intro- 
duce postulates which would most probably appear as theorems if a different 
mode of attack on the problem were used. 

POSTULATE 5.4. If e is a primitive idempotent element of A, x an element
of eAe which does not have an inverse with respect to e, and y = e − x,
then either

$e + x + x^2 + \cdots$

or

$e + y + y^2 + \cdots$

exists as an element of A.

It follows from this postulate that y has an inverse with respect to e,
namely z = e + x + x² + ···. For if z exists, evidently yz = e; and,
if z does not exist, by our postulate w = e + y + y² + ··· exists, which
is impossible as it would then be an inverse of x with respect to e.
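
When x is nilpotent the series e + x + x² + ··· terminates, and the claim that it inverts y = e − x can be verified by direct computation. The sketch below works in the algebra of 3×3 integer matrices with e the identity, which is an assumption made for illustration; the strictly upper triangular matrix x and the helper names are not from the paper.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def geometric_sum(x):
    """Return e + x + x^2 + ... for a nilpotent square matrix x."""
    n = len(x)
    zero = [[0] * n for _ in range(n)]
    total = identity(n)
    power = x
    for _ in range(n):            # a nilpotent n x n matrix satisfies x^n = 0
        if power == zero:
            break
        total = add(total, power)
        power = matmul(power, x)
    if power != zero:
        raise ValueError("x is not nilpotent, so the series does not terminate")
    return total


x = [[0, 1, 2], [0, 0, 3], [0, 0, 0]]
e = identity(3)
y = sub(e, x)
z = geometric_sum(x)
```

For an element that is not nilpotent the function refuses to return a value, which mirrors the point of the text: the existence of the infinite series is then a genuine additional assumption and not a consequence of the finite algebraic operations.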




An element of an algebra A which does not have an inverse with respect 
to any idempotent element of A will be said to be singular* in A. With 
reference to such elements we have the following theorems. 

THEOREM 5.1. If e is a primitive idempotent element, every element x of
eAe which is singular in eAe is also singular in A.

If x is not singular in A, there is an element y such that e@, xy is 
idempotent, where, since e, = xye, We may assume ye y. Since 

= gx. it follows that ee, ¢,; also, if és ee, e, then @& = ¢,¢ and 


Hence, since e,~-eAe and e is primitive, either e — 0 ore — e. If 
e, == 0, then e, = 
rye e, contrary to the assumption that x is singular in eAec. The 
theorem is therefore proved. 

THEOREM 5.2. If e is an idempotent element of A, any idempotent element
which is primitive in eAe is also primitive in A.

Let e, be an idempotent element of eAe. If e, is not primitive in 4, 
there is a primitive idempotent element ec. for which ¢, és Cs C21. 
from which it follows that 


e = = ee,ee, e, then 


If, therefore, e, is primitive in eAe, it must also be primitive in A. 

We give now another postulate which includes Postulate 5.4 but is here
stated separately as it is not used till Theorem 7.5 is reached, and even
there it is not strictly speaking necessary.

POSTULATE 5.5. If x is singular in A, every element of the form Σ ξ_n x^n
exists in A.

To prove Postulate 5.4 on this basis we may proceed as follows. If
x ≤ eAe and if there is an element z such that xz is idempotent, then
e_1 = xze is idempotent and commutative with e; for, since ex = x,

$e_1^2 = xz(ex)ze = (xz)^2 e = xze = e_1, \quad ee_1 = e_1 = e_1 e.$
* Scorza, Rendiconti del Circolo Matematico di Palermo, vol. 45 (1921), p. 41, 
uses the term “exceptional” in much the same sense; but, as his definition implies either 
a finite basis or that the element is nilpotent, I have thought it necessary to use a 
different term. D, p. 46, calls exceptional elements “properly nilpotent”. 




Since e is primitive, this is impossible unless e_1 = e or e_1 = 0 (in which
case it is not strictly speaking idempotent). If e_1 = e, then ze is an
inverse of x with respect to e; if e_1 = 0, then

$0 = (xz)^2 = xz,$

which is impossible since xz is idempotent. Hence x is either singular or
has an inverse with respect to e, from which Postulate 5.4 follows
immediately.

6. Singular invariant subalgebras. A singular subalgebra B of an
algebra A is defined as a subalgebra no element of which has an inverse
with respect to any idempotent element of A. If B = A, then A contains
no idempotent element and is said to be singular in itself; if, on the other
hand, A does contain an idempotent element, we shall say that it is
non-singular. For example, a nilpotent subalgebra is singular in any algebra
in which it is invariant; or again, the algebra whose basis is x, x², x³, ···
is not singular in the algebra ···, x^{−1}, e = x⁰, x, x², ··· while it is singular
in the subalgebra e, x, x², ···.

A semi-simple algebra is one which is non-singular and which possesses 
no singular invariant subalgebra. 

We shall now show that singular invariant subalgebras have, in the main, 
the properties possessed by nilpotent invariant subalgebras in the case of 
algebras that have a finite basis. 

THEOREM 6.1. If an element x is singular in an invariant subalgebra B
of A, it is also singular in A.

For, if xy = e were idempotent, then e ≤ B since x ≤ B and B is
invariant.

THEOREM 6.2. If an invariant subalgebra B of an algebra A contains
no idempotent element, it is singular; and if it contains an idempotent
element, it also possesses a primitive idempotent element of A.

The first part of this theorem follows immediately from the proof of
Theorem 6.1. If B contains an idempotent element, then by Postulate 5.1
there is an idempotent element e which is primitive in B. If e is not
primitive in A, there is a primitive idempotent element e_1 ≠ e in A such
that ee_1 = e_1. Since e lies in B, which is invariant, it follows that e_1 ≤ B,
which is impossible since e is primitive in B. Hence e must be primitive
in A as well as in B.

THEOREM 6.3. If e is an idempotent element of A, any element of eAe
which is singular in eAe is also singular in A.

For, if x ≤ eAe and xy = e′, where e′ is idempotent, then ee′ = exy
= xy = e′ and e″ = xye ≤ eAe; also e″² = e′ee′e = e′e = e″
and e″ ≠ 0 since e″e′ = (e′)² = e′, so that e″ is an idempotent element
of eAe relative to which x has an inverse.

THEOREM 6.4. The totality of elements which are singular in A form a 
singular invariant subalgebra S of A which contains every singular invariant 
subalgebra of A. 

This theorem is proved as follows. If x is singular, so is also yx; for,
if it is not singular, there is an element z for which zyx = e is idempotent,
which contradicts the assumption that x is singular. Hence, if B is the
totality of elements which are singular in A, Ax and xA are contained
in B, which is therefore closed under the operation of multiplication and 
has invariantive properties. We have then only to prove that B is closed 
under the operation of addition as it is clear that it will then be a singular 
invariant subalgebra. 

Let a, and a» be elements of B and suppose, if possible, that «, +- 2, 
has an inverse y, say 7,y¥+2#2.y =e where ye =e and e is idempotent; 
since y¥<7,A<B, B, we may obviously assume that 
ty +x =e, en = CX, Xz = re. If e is not primitive, there 
exists a primitive idempotent element ¢, for which ce, — e, = ee, so that 
+ = Wwe may therefore assume that e is primitive. This is, 
however, impossible in view of Postulate 5.4, and hence B is closed under 
addition.* 

THEOREM 6.5. If S is the maximal singular invariant subalgebra of a 
non-singular algebra A, A—S is semi-simple. 

Let

$A = B + S, \quad B \frown S = 0.$

A - S is non-singular since we may choose B so as to contain at least one
of the idempotent elements that exist in A. If, then, A - S has a proper
invariant singular subalgebra T, there is in B a complex $T_1$ that contains
no element x for which $x^2 \equiv x \pmod S$ and for which

$B T_1 < T_1, \quad T_1 B < T_1 \pmod S.$

It follows that $T_1 + S$ is, in A, a proper invariant subalgebra which
contains S. Since S is maximal, $T_1 + S$ must contain an idempotent
element e, which we may take to be primitive in view of Theorem 6.2.
We have therefore e = x + y, where $x < T_1$ and y < S; and $x \neq 0$, since
e is not contained in S. This gives $x^2 \equiv x \pmod S$, whereas $T_1$ contains
no such element; the theorem then follows immediately.

*In W, p. 91, Theorem 15, the statement that the totality B forms an invariant sub-
algebra was omitted. My attention was called to the need of this addition to the theorem
by Professor L. E. Dickson in 1914 and the proof given here is essentially the one made
at that time. When A has a finite basis, Postulate 5.4 is superfluous since the series
$x + x^2 + x^3 + \cdots$ terminates when x is nilpotent. A different proof is given by Scorza,
loc. cit., p. 42.

THEOREM 6.6. If N is a maximal nilpotent invariant subalgebra of A,
every nilpotent invariant subalgebra of A is contained in N.

This theorem is proved in much the same way as when A has a finite
basis. Let $N_1$ be any nilpotent invariant subalgebra of A other than
N; $N + N_1$ is then also invariant. If $N \frown N_1 = N_2$, we have


$(N + N_1)^n < N^n + N_1^n + N_2$


for every positive integer n. If the indices of N and $N_1$ are finite, it
follows immediately that $N + N_1$ is nilpotent. If either index is
transfinite, we have


(N+N,)” < N°+N°+N, = ™, 


from which we derive in the same way 


2 


(N+ N, yw NY. Ne’ + No L N, 


3 1 

and so on. If then $\nu$ is the greater of the indices of N and $N_1$, it follows
that $(N + N_1)^{\nu} < N_2$, which is nilpotent. Since N is maximal it follows
that $N_1 < N$.

If N is a nilpotent invariant subalgebra which is maximal with respect 
to the property of having a finite index (so that N is possibly contained 
in some nilpotent invariant subalgebra whose index is transfinite) the same 
proof shows that every nilpotent invariant subalgebra whose index is finite 
is contained in N; for if the index of N is finite, so is also the index of
each of its subalgebras and, in particular, the index of $N_2$ in the above
proof is finite.

THEOREM 6.7. Every algebra A which does not have a modulus either has 
a singular invariant subalgebra or is itself singular. 

Let e be a principal idempotent element of A and let

$A = A_{11} + A_{10} + A_{01} + A_{00}$

be the Peirce decomposition of A relative to e.

Suppose in the first place that $A_{00} \neq 0$. If some element of $A_{00}$ is not
singular in A, $A x_{00}$ must contain an idempotent element for some $x_{00} < A_{00}$.
But $A x_{00} < A_{00} + A_{10}$; hence we must have an idempotent element of the
form $x_{00} + x_{10}$, where $x_{00} < A_{00}$ and $x_{10} < A_{10}$. This gives

$x_{00}^2 + x_{10} x_{00} = x_{00} + x_{10}.$

Now $x_{00}^2 < A_{00}$, $x_{10} x_{00} < A_{10}$, and $A_{00} \frown A_{10} = 0$; hence $x_{00}^2 = x_{00}$ and, seeing
that $A_{00}$ contains no idempotent element, it follows that $x_{00} = 0$ and therefore
$x_{10} = x_{10} x_{00} = 0$. The elements of $A_{00}$ are therefore singular in A.

If $A_{00} = 0$, then $A_{10} + A_{01} \neq 0$ since A has by supposition no modulus.
For any x < A,

$(A_{10} + A_{01})\,x < A_{01} + A_{10} A_{01},$

and hence, if $A_{10} + A_{01}$ is not composed entirely of singular elements, there
must be an idempotent element of the form $x_{01} + x_{11}$, where $x_{01} < A_{01}$ and
$x_{11} < A_{10} A_{01}$. But

$x_{11}^2 < A_{10} A_{01} A_{10} A_{01} < A_{10} A_{00} A_{01} = 0, \quad x_{01} x_{11} < A_{01} A_{10} A_{01} < A_{00} A_{01} = 0.$

This gives

$x_{01} + x_{11} = (x_{01} + x_{11})^2 = 0,$

so that there is no idempotent element in $A_{10} + A_{01}$.
Hence in every case A contains singular elements and by Theorem 6.4
also a singular invariant subalgebra.
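
The Peirce decomposition used in the proof above can be tried out on a small matrix algebra. The following is only an illustrative sketch: the 2-by-2 integer matrices, the particular idempotent e, and the helper names `matmul`, `combine` and `peirce_components` are assumptions made for the example and do not come from the paper. The component $x_{00}$ is formed as x - ex - xe + exe, so that no modulus is required.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def combine(terms):
    """Signed sum of matrices; terms is a list of (sign, matrix) pairs."""
    n = len(terms[0][1])
    return [[sum(sign * m[i][j] for sign, m in terms) for j in range(n)]
            for i in range(n)]


def peirce_components(e, x):
    """Split x into (x11, x10, x01, x00) relative to the idempotent e.

    Convention assumed here: e*x1j = x1j, e*x0j = 0, xi1*e = xi1, xi0*e = 0.
    """
    ex, xe = matmul(e, x), matmul(x, e)
    exe = matmul(ex, e)
    x11 = exe
    x10 = combine([(1, ex), (-1, exe)])
    x01 = combine([(1, xe), (-1, exe)])
    x00 = combine([(1, x), (-1, ex), (-1, xe), (1, exe)])
    return x11, x10, x01, x00


e = [[1, 1], [0, 0]]   # an idempotent that is not diagonal
x = [[1, 2], [3, 4]]
x11, x10, x01, x00 = peirce_components(e, x)
```

The four components add up to x again, and each is annihilated by e on the appropriate side, which is all that the argument of Theorem 6.7 uses.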
7. Simple algebras. The discussion of the structure of a simple algebra 
parallels closely the corresponding theory for algebras with a finite basis. 
THEOREM 7.1. A simple algebra, which is not singular, possesses a modulus. 
This theorem is an immediate consequence of Theorem 6.7 but, because
the proof given there depends on Postulate 5.4, it seems worth while to 
give an independent demonstration. Since an algebra A which is not singular
possesses a principal idempotent element e, we may express A in Peirce's
form
$A = A_{00} + A_{10} + A_{01} + A_{11}$
where
$eA_{1j} = A_{1j}, \quad eA_{0j} = 0 \quad (j = 0, 1).$


If we set
$B_1 = A_{00} + A_{10}, \quad B_2 = A_{00} + A_{01};$


A B, Bs A Bs 
A B, AA By A Bs Bb, 
B, Bs A B, A A B, A B, B,, 


and therefore, since A is simple, we must have B, B, = 0. This 


(Bi +B, = Bi + < Ao, 
A(B, + AB,+AB, = AB, < B, < B,+B,, 
(B, Bs) A= B, A Bs A= B.A Bs B, Bs. 


But A is simple; hence B, + B, = 0, that is, A = A,,, which proves the 
theorem. 

THEOREM 7.2. If e is an idempotent element of a simple algebra A, eAe
is simple.

For, if B is a proper invariant subalgebra of eAe,


$eABAe = eAe \cdot B \cdot eAe < B;$


therefore ABA is a proper invariant* subalgebra of A. This is impossible
since A is simple.
If e is primitive, it follows from Theorem 6.3 that every element of eAe 
has an inverse relative to e; such an algebra is called a division algebra. 
Let $e_r$, r in a range G, be a complete primitive complementary set for
the simple algebra A; then $e = \sum_G e_r$ is the modulus of A and, if $A_{rs} = e_r A e_s$,
the Peirce form of A is $\sum_G \sum_G A_{rs}$. Since A is simple, $A e_s A = A$, as otherwise
it would be an invariant subalgebra of A; hence, multiplying on the
left by $e_r$ and on the right by $e_t$, we have


(7.1) $A_{rs} A_{st} = A_{rt}$.†


We shall now show that, if $x_{rs}$ and $x_{st}$ are any elements, different from 0,
of $A_{rs}$ and $A_{st}$ respectively, then $x_{rs} x_{st} \neq 0$. Since A is simple and $e_t$ is
primitive, $A_{tt}$ is a division algebra for every t. Suppose that $x_{rs} x_{st} = 0$;
then $x_{rs} x_{st} A_{ts} = 0$. But $x_{st} A_{ts} < A_{ss}$, which is a division algebra, and hence,
for any $x_{ts} < A_{ts}$ such that $x_{st} x_{ts} \neq 0$, there is a $y_{ss} < A_{ss}$ for which
$x_{st} x_{ts} y_{ss} = e_s$. Hence, as $x_{rs} e_s = x_{rs} \neq 0$, $x_{rs} x_{st} A_{ts} = 0$ entails $x_{st} x_{ts} = 0$. It
follows for every $x_{ts} < A_{ts}$ that $x_{st} x_{ts} = 0$, and therefore $x_{ts} A_{st} = 0$ by
a repetition of the same argument. But $x_{ts}$ is any element of $A_{ts}$, so that
our supposition that $x_{rs} x_{st} = 0$ has led to $A_{ts} A_{st} = 0$, which contradicts (7.1);
hence $x_{rs} x_{st} \neq 0$ unless one of the factors is 0.

* Cf. Scorza, loc. cit., p. 15.
† Cf. Scorza, loc. cit., p. 76. It is not necessary here that the $e_r$ should be primitive.

Using the same reasoning as above, only multiplying on the right instead
of on the left, we see that there is an element $y_{sr} < A_{sr}$ for which $y_{sr} x_{rs} = e_s$;
also, if we set $x_{rr} = x_{rs} y_{sr}$, then


$x_{rr}^2 = x_{rs} \cdot y_{sr} x_{rs} \cdot y_{sr} = x_{rs} e_s y_{sr} = x_{rr}$


and therefore $x_{rr} = e_r$, seeing that the latter is primitive. Further, since


$A_{st} = y_{sr} x_{rs} A_{st} < y_{sr} A_{rt} < A_{st},$
it follows that


$A_{st} = y_{sr} A_{rt}$
and
$x_{rs} A_{st} = x_{rs} y_{sr} A_{rt} = A_{rt}.$


We shall now show that it is possible to find a set of elements
$e_{rs}$ (r, s in G, $e_{rs} < A_{rs}$) such that $e_{rs} e_{st} = e_{rt}$, $e_{rs} e_{tu} = 0$ $(s \neq t)$ and
$e = \sum e_{rr}$. Set $e_{rr} = e_r$ and let $e_{pt}$ be any non-zero element of $A_{pt}$, p being
a fixed and t a variable element* of G. We have already shown that
there then exist elements $e_{tp}$ such that $e_{tp} e_{pt} = e_t$ and $e_{pt} e_{tp} = e_p$, and,
if we set in general $e_{st} = e_{sp} e_{pt}$, it is readily seen that the elements so
defined have the required property. We are now ready to prove the following
fundamental theorem.

THEOREM 7.3. Every simple algebra with a modulus can be expressed as 
the direct product of a division algebra and a simple matric algebra. 

Since $A_{st} = e_{sp} A_{pp} e_{pt}$, there is a (1,1)-correspondence between the
elements of each $A_{st}$ and those of a fixed $A_{pp}$, which, as we have seen,
is a division algebra. The theorem then follows exactly as in the case of
algebras with a finite basis.† We have also the converse theorem.

THEOREM 7.4. The direct product A of a simple matric algebra B and a 
division algebra C is simple; and any element of A which is commutative 
with every element of A is an element of C. 


* Cf. Scorza, loc. cit., p. 78. 
† Cf. W, p. 98; D, p. 76.




The proof is the same as for algebras with a finite basis.* 

THEOREM 7.5. If S is the maximal singular invariant subalgebra of an
algebra A which possesses a modulus, and if A - S is simple, A can be expressed
as the direct product of a simple matric algebra and an algebra whose
modulus is its only idempotent element.

We shall not in the first instance assume that A has a modulus but
only that A - S is simple† and $A \neq S$. Let $e_r$, r in G, be a complete
primitive complementary set of idempotent elements of A; the elements
of this set are linearly independent modulo S since, were $\sum \xi_r e_r \equiv 0$
(mod S), we should have


$e_p \sum \xi_r e_r = \xi_p e_p \equiv 0 \pmod S,$


whereas $e_p$ is not contained in S for any p. It follows from the proof of
Theorem 7.3 (taking into account the second footnote of page 414) that
$A_{rs} A_{st} \equiv A_{rt} \pmod S$.
Now $A_{rs} A_{sr}$ is an invariant subalgebra of $A_{rr}$ which is not singular seeing
that it is not congruent to 0 modulo S; and since $e_r$ is primitive, it follows
from Theorem 6.4 that any proper invariant subalgebra of $A_{rr}$ is necessarily
singular; hence $A_{rs} A_{sr} = A_{rr}$. We have also $A_{rr} A_{rs} = A_{rs}$ because
$A_{rr}$ contains $e_r$ and, since $A_{rs} > A_{rt} A_{ts}$,


$A_{rs} A_{st} > A_{rt} A_{ts} A_{st} = A_{rt} A_{tt} = A_{rt},$


and also $A_{rs} A_{st} < A_{rt}$; hence $A_{rs} A_{st} = A_{rt}$.

We must now prove that $x_{rs} A_{sr} = A_{rr}$ when $x_{rs}$ is an element of $A_{rs}$
which does not belong to S. Now $x_{rs} A_{sr} \not\equiv 0 \pmod S$; for, if this were
so, then $x_{rs} A_{ss} = x_{rs} A_{sr} A_{rs} \equiv 0 \pmod S$, which is impossible since
$x_{rs} = x_{rs} e_s < x_{rs} A_{ss}$, so that $x_{rs} A_{sr}$ contains non-singular elements of
A and, by Theorem 6.3, also elements which are not singular in $A_{rr}$. Hence
if x is any non-singular element of $x_{rs} A_{sr}$, there is an element y of $A_{rr}$
such that $xy = e_r$, so that


$e_r = xy < x_{rs} A_{sr} A_{rr} < x_{rs} A_{sr},$


that is, to any $x_{rs}$ not contained in S, there is an element $x_{sr} < A_{sr}$ such
that $x_{rs} x_{sr} = e_r$.
It then follows as in the proof of Theorem 7.3 that $x_{rs} A_{st} = A_{rt}$, that
there exists a simple matric algebra $e_{rs}$, r, s in G, and that $\sum \sum A_{rs}$ is
the direct product of this simple matric algebra and any $A_{rr}$. If we now
add the condition that A has a modulus so that $A = \sum \sum A_{rs}$, the proof
of the theorem is complete.

* Cf. W, p. 99; D, p. 79.
† It then follows from Theorem 7.1 that A - S has a modulus.

The above discussion renders it probable that an idempotent element 
which is primitive in A corresponds to a primitive idempotent element in 
A—S; and in fact this can be shown by a somewhat roundabout argument. 
If, however, Postulate 5.5 is assumed, the proof is more direct and also 
contains some points of intrinsic interest. 

If $x^2 - x = y \equiv 0 \pmod S$ and we set $z = f(y) + x\,g(y)$,
we readily find that a formal solution of the equation $z^2 = z$ is given by*


$z = \frac{1}{2} + \frac{2x - 1}{2\sqrt{1 + 4y}} \equiv x \pmod S,$


where Postulate 5.5 is required in order that the series used should exist 
as elements of A. If z is primitive in A but x does not correspond to a 
primitive idempotent element in A—S, we may write x — 2+ .22 where 


and we may, without loss of generality, suppose 2, modified as above so 
that 2? = x,, since this still leaves x =z (mod 8). If 2,7, = = 
we have = 2,7, = = and therefore x, (7,—y,,) = 0; we may 
therefore suppose y,, = 0 and, since then x, (23—z,) == 0, we may modify 


r, as before so that it is idempotent and still keep z,7,— 0. x, = 7,— 2,2, 


is then idempotent and z,x, = 0 = 2,27,. We shall therefore assume 
= 0 = 2, = X= x, and z = x,+27,+y where y<S. This 
gives 


and hence, seeing that z is primitive in A, there exist elements w, and w, 
such that 


=z, = 2 (mod S). 


The first of these congruences gives x,z2 = 0 (mod 8) and the second 


*If 2 = 0 in the field, we may set $z = x + y + y^2 + y^4 + y^8 + \cdots$.




L,2 = 2; the supposition that the element in A—S corresponding to z is 
not primitive has therefore led to a contradiction. 
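
The formal solution of $z^2 = z$ used in this argument can be checked numerically in a setting where the series terminates. The sketch below rests on assumptions that are not in the paper: the algebra is taken to be the integers modulo $6^4$, the element $y = x^2 - x$ is nilpotent there, and $1/\sqrt{1+4y}$ is expanded as $\sum_n \binom{2n}{n}(-y)^n$, so that $z = x + (2x - 1)\sum_{n \ge 1} \tfrac{1}{2}\binom{2n}{n}(-y)^n$. The function name `lift_idempotent` and its parameters are hypothetical.

```python
from math import comb


def lift_idempotent(x, modulus, nil_index):
    """Return z with z*z = z (mod modulus), assuming y = x*x - x satisfies
    y**nil_index = 0 (mod modulus).

    Uses z = x + (2x - 1) * sum_{n>=1} (C(2n, n) / 2) * (-y)**n,
    which is the binomial expansion of 1/2 + (2x - 1) / (2 * sqrt(1 + 4y)).
    """
    y = (x * x - x) % modulus
    z = x
    for n in range(1, nil_index):
        half_central = comb(2 * n, n) // 2   # C(2n, n) is even for n >= 1
        z += (2 * x - 1) * half_central * pow(-y, n, modulus)
    return z % modulus


modulus = 6 ** 4            # y = 3*3 - 3 = 6 is nilpotent of index 4 here
z = lift_idempotent(3, modulus, 4)
```

Here x = 3 is idempotent only modulo 6, while the value z returned is idempotent modulo $6^4$ and still congruent to x modulo 6, which mirrors the passage from an idempotent of A - S to one of A.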

The methods which are used in the theory of algebras with a finite basis 
to show that a simple algebra cannot be singular except in the trivial case
of the algebra that consists of one unit x for which $x^2 = 0$, depend on
induction and therefore cannot be extended directly when a finite basis 
is wanting. We can however apply these methods to prove the following 
theorem. 

THEOREM 7.6. A simple commutative algebra is either a division algebra,
or is the algebra of one unit x for which $x^2 = 0$.

If x is any non-zero element, we must have Ax = A, since otherwise,
if $Ax \neq 0$, it is a proper invariant subalgebra of A and, if Ax = 0, {x} = (x)
is a proper invariant subalgebra of A, unless of course A = (x). Moreover,
if z is any other non-zero element, we cannot have zx = 0 since,
were this so, we should have


A = Ax = Azx = 0.


But, if Ax = A, there must be an element e such that ex = x; this gives
$(e^2 - e)x = 0$ and therefore $e^2 - e = 0$ so that e is idempotent. We have
shown that the product of two elements can only vanish if one of them
is zero; hence A is a division algebra with e as modulus.

8. Reducibility. The principal theorems regarding the uniqueness of 
the expression of an algebra as a direct sum which are true of algebras 
with a finite basis also hold when no finite basis exists. Before showing 
this however we must first prove the following theorem, which is trivial 
when the basis is finite. 

THEOREM 8.1. Every reducible algebra A that has a modulus possesses an 
irreducible reduced part. 

Let $e_r$, r in G, be a complete primitive set of idempotent elements. If
$A_{rs} = e_r A e_s$, then no $A_{rr}$ is 0 since $A_{rr}$ contains $e_r$. If for some fixed r
every $A_{rs}$ and $A_{sr}$ $(s \neq r)$ is 0, then evidently A is reducible with $A_{rr}$ as
one reduced part; and $A_{rr}$ is irreducible since the only idempotent element
it contains is its modulus $e_r$, while Theorem 4.4 requires that each part
of a reducible algebra with a modulus shall also have a modulus. We may
suppose therefore that $A_{rs} + A_{sr} \neq 0$ for some $s \neq r$; then that $A_{s s_1} + A_{s_1 s} \neq 0$
for some $s_1 \neq s$, and so on. We define in this way a subset H of the range G
such that (i) if s is a value of t in H, there are a finite number of values
of t in H, say $s_1, s_2, \cdots, s_n$, for which $A_{r s_1} + A_{s_1 r}$, $A_{s_1 s_2} + A_{s_2 s_1}$, $\cdots$, $A_{s_n s} + A_{s s_n}$
are all different from 0, and (ii) all values of s which may be reached in
this manner from r in a finite number of steps are contained in H.

Let K be the complement of H in G and set


$e_H = \sum_H e_t, \quad e_K = \sum_K e_t.$


If $e_H A e_K \neq 0$, there must be some p < H and q < K for which $A_{pq} \neq 0$;
but p can be reached from r in a finite number of steps and hence q also,
in contradiction to the definition of H and K. Hence $e_H A e_K = 0$ and
similarly $e_K A e_H = 0$, so that A is reducible. Moreover $e_H A e_H = A'$ is
irreducible; for if $A' = B \oplus C$, then $e_r$, being primitive, must lie in either
B or C, say B, and, if $e_s$ is any one of the original set of idempotent
elements which belongs* to C, s cannot be reached from r by a finite number
of the steps used in the definition of H, since, if $e_p < B$, $e_q < C$, then $A_{pq}$
and $A_{qp}$ are both zero.† The proof of the theorem is therefore complete.
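
The set H of this proof is the class of indices that can be reached from r through a finite chain of non-vanishing Peirce blocks. For a finite range G this is an ordinary graph search, as the following sketch shows. It is an illustration only: the function name `reachable_class` and the encoding of the non-zero blocks as a set of index pairs are assumptions made for the example, and the paper itself places no finiteness restriction on G.

```python
from collections import deque


def reachable_class(r, nonzero_blocks):
    """Return the set H of indices reachable from r.

    nonzero_blocks is a set of pairs (s, t), s != t, recording that the
    block A_st is not 0; a step from s to t is allowed when A_st + A_ts != 0.
    """
    neighbours = {}
    for s, t in nonzero_blocks:
        neighbours.setdefault(s, set()).add(t)
        neighbours.setdefault(t, set()).add(s)

    reached = {r}
    queue = deque([r])
    while queue:
        s = queue.popleft()
        for t in neighbours.get(s, ()):
            if t not in reached:
                reached.add(t)
                queue.append(t)
    return reached


blocks = {(1, 2), (3, 2), (4, 5)}   # hypothetical pattern of non-zero blocks
H = reachable_class(1, blocks)       # indices linked to 1
K = {1, 2, 3, 4, 5} - H              # the complement of H in G
```

With this pattern no block joins an index of H to an index of K, so the sums of the idempotents over H and over K split the algebra into two reduced parts.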

We are now ready to prove the theorems referred to at the beginning 
of the section. 

THEOREM 8.2. If an algebra A which has a modulus is expressed in two
ways as the direct sum of irreducible parts, these two expressions differ only
in the order in which the constituent parts occur and both contain every
irreducible reduced part of A.

By Theorem 8.1 A possesses irreducible reduced parts; let $A_t$, t in a
range G, denote these parts and set $B = \sum A_t$; B is then an invariant
subalgebra of A since by Postulate 5.3 $AB = \sum A A_t = \sum A_t = B$, and
similarly BA = B. By Theorem 4.4 every $A_t$ has a modulus $e_t$ and, by
Theorem 4.6, $A_r A_s = 0$ $(r \neq s)$; also, by Postulate 5.2, $e = \sum e_t$ exists,
and it is evidently the modulus of B. Hence, by Theorem 4.3, B is a
reduced part of A, say $A = B \oplus C$. But, by Theorem 4.4, C has a modulus
and therefore it is either itself an irreducible reduced part of A or it contains
such a part. This is impossible since $B \frown C = 0$, and therefore C = 0,
whence B = A.

If now $A = \sum B_s$ is any expression of A as the direct sum of irreducible
parts, then $A_t = e_t A e_t = \sum_s e_t B_s e_t$. If $e_t B_s e_t \neq 0$, then by Theorem 4.3
$e_t B_s e_t = B_s$, since it is an invariant subalgebra of $B_s$ which has a modulus
and $B_s$ is irreducible; also, since $e_t < A_t$, which is invariant in A, it follows
in the same way that $e_t B_s e_t = A_t$. Hence $A_t = B_s$, that is, every $A_t$
occurs in the set $B_s$, and, since the set $A_t$ contains every irreducible reduced
part of A, the set $B_s$ is only a rearrangement of it.

* If e. is a primitive idempotent element, = and = e” are idempotent if 
not zero; but es = e’+e”", e’e” = 0 = e"e’ and ez is primitive; hence one of e’, e” is 0. 
Any primitive idempotent element therefore belongs to one or other of B and C. 

+ For instance, Apg = ep = Cpe; = O. 




THEOREM 8.3. Every algebra A which has no modulus either has no invariant
subalgebra which has a modulus or can be expressed uniquely as the direct
sum of such an algebra and an algebra which has a modulus.

Let B' be an invariant subalgebra of A which has a modulus; then by
Theorem 4.3, $A = B' \oplus C'$, and therefore A possesses at least one irreducible
reduced part which has a modulus. As in the previous theorem,
the algebra B which is the sum of all the irreducible reduced parts of A
that have a modulus is the direct sum of these parts and has a modulus.
Hence, by Theorem 4.3, $A = B \oplus C$. Here C has no modulus, since A
has none, and it has no irreducible reduced part with a modulus, since
any such part is also an irreducible part of A and so belongs to B, whereas
$B \frown C = 0$. By Theorem 4.3 and its corollary, C has no invariant subalgebra
with a modulus and is unique. The theorem is therefore proved.

THEOREM 8.4. An algebra which has no modulus and no invariant sub- 
algebra with a modulus becomes irreducible if a modulus is added to it. 

For if A* denote the algebra obtained by adding a modulus e to A,
and if $A^* = B \oplus C$, then each of B and C has a modulus, say $e_1$ and $e_2$.
Now $e_1 = x + \xi e$, $e_2 = y + \eta e$, where x, y < A and $\xi$, $\eta$ are scalars.
But $e_1 e_2 = 0$ and therefore $\xi\eta = 0$. If, say, $\xi = 0$, then $e_1 < A$ and
therefore also $B = e_1 A^* e_1 < A$; and this is impossible, as B would then
be a reduced part of A with a modulus.

9. Illustrative examples. We shall now give a number of examples
some of which are given to illustrate the theory of the preceding sections
and others because of their intrinsic interest. In most of these examples
we shall denote the basis of A by $x_t$ where the variable t runs through
a range G; $x = \sum_G \xi_t x_t$, $y = \sum_G \eta_t x_t$, etc., will denote general elements
of A and $\xi(t)$, $\eta(t)$, etc., the corresponding functions of t. The elements
$x_t$ are linearly independent unless otherwise stated.


Example 9.1. Let


$x_t^2 = x_t, \quad x_s x_t = 0 \quad (s \neq t).$
Then $x + y = \sum (\xi_t + \eta_t) x_t$ and $xy = \sum \xi_t \eta_t x_t$, and, in the notation of
§ 2, $\xi(t) \times \eta(t) = \xi(t)\eta(t)$. The functional product in this case
therefore corresponds to ordinary multiplication.
Example 9.2. Let
(9.201) $x_s x_t = x_{s+t}$.


We shall call the corresponding algebra the power algebra since this is
what A becomes when G is the set of positive integers. Here we have


(9.202) $xy = \sum_{G(t)} x_t \sum_{G(s)} \xi_s \eta_{t-s}$

or
(9.203) $\xi(t) \times \eta(t) = \sum_{G(s)} \xi(s)\,\eta(t - s)$.


(9.21) Let G be the set of positive integers; the order of A is then $\aleph_0$.
If $x = x_1$, then $x_t = x^t$, so that this algebra is the algebra of one
indeterminate, that is, it uses the Grassmann indeterminate or general
product. If 0 is included in G, $x_0$ is the modulus and the algebra is then
equal to $\{x_0, x\}$; {x} is the maximal invariant subalgebra.

In this algebra the summation sign $\sum$ indicates ordinary algebraic
summation, but no question of convergence is involved so long as the given
basis is used.
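
For elements with only a finite number of non-zero coordinates, the product in the power algebra is the familiar convolution of coefficient functions. The sketch below assumes the multiplication rule $x_s x_t = x_{s+t}$ on a range G of integers; the dictionary encoding of an element (index t mapped to coordinate $\xi_t$) and the name `power_product` are choices made for the illustration only.

```python
def power_product(xi, eta):
    """Convolution product of two finitely supported coefficient functions.

    xi and eta map an index t to the coordinate of x_t; the basis rule
    x_s * x_t = x_(s+t) is assumed.
    """
    out = {}
    for s, a in xi.items():
        for t, b in eta.items():
            out[s + t] = out.get(s + t, 0) + a * b
    # drop indices whose coordinate cancelled to zero
    return {t: c for t, c in out.items() if c != 0}


p = {1: 1, 2: 2}             # x + 2x^2
q = {1: 3, 3: 1}             # 3x + x^3
product = power_product(p, q)
```

Since only finite sums occur, no question of convergence arises, in agreement with the remark made above.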
(9.22) Let G be the set of positive and negative integers and 0. Here a
distinction must be made between the case in which $\sum$ refers to sums
with a finite number of terms only and that in which infinite sums are
allowed. In the latter case A is, in some sense at least, equivalent to (9.21);
for, if $e = x_0$, $b = e - x$, then


$x = e - b, \quad x^{-1} = e + b + b^2 + \cdots,$
and so on.
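
The series $e + b + b^2 + \cdots$ that occurs here is the formal inverse of e - b, and this can be verified term by term on truncated series. The following sketch makes its own assumptions: a series in b is stored as the list of its first n coefficients, and the names `multiply_truncated` and `geometric_series` are hypothetical helpers, not notation from the paper.

```python
def multiply_truncated(p, q, n):
    """Product of two series in b, keeping only the powers b^0 ... b^(n-1)."""
    out = [0] * n
    for i, a in enumerate(p[:n]):
        for j, c in enumerate(q[:n - i]):
            out[i + j] += a * c
    return out


def geometric_series(n):
    """First n coefficients of e + b + b^2 + ..."""
    return [1] * n


n = 8
e_minus_b = [1, -1] + [0] * (n - 2)
check = multiply_truncated(e_minus_b, geometric_series(n), n)
```

Every coefficient of the product beyond the first cancels, so the product agrees with e up to any prescribed power of b.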
(9.23) Let G be the interval $a \le t \le \infty$ in the real continuum or, alternatively,
$a \le t < \infty$. As before we may set $x_t = x^t$; the algebra has
then a modulus only if G includes t = 0. If certain integrability conditions
are satisfied by the functions involved, $\sum_G$ may be interpreted as ordinary
integration. For instance, if a = 0, we may set*


(9.231) $\xi(t) \times \eta(t) = \int_0^t \xi(s)\,\eta(t - s)\,ds = t \int_0^1 \xi(ts)\,\eta[t(1 - s)]\,ds.$


It is easily shown that this product is associative.
(9.24) We may take for G any aggregate of sets of points such that the 
logical sum of any two sets is also a member of G. 

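Example 9.3. An algebra closely allied to the preceding one is given by


(9.301) $x_s x_t = x_{st}$,
(9.302) $xy = \sum_{G(t)} x_t \sum_{G(s)} \xi_s \eta_{t/s}$
or
(9.303) $\xi(t) \times \eta(t) = \sum_{G(s)} \xi(s)\,\eta(t/s)$.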

*This product and the one given in (9.321) below are of course well known, usually 
with a somewhat more general range than that given here. 




(9.31) If G is the set of positive integers, A is the algebra used by
Professor E. T. Bell* in the theory of numbers and called by him a Dirichlet
algebra because of its connection with Dirichlet series.
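
When G is the set of positive integers the product of two coefficient functions is what is usually called Dirichlet convolution. The sketch below assumes the basis rule $x_s x_t = x_{st}$ (the rule that underlies the multiplication of Dirichlet series) and works with functions cut off at a finite limit; the list encoding, with index 0 unused, and the name `dirichlet_product` are assumptions of the example.

```python
def dirichlet_product(xi, eta, limit):
    """Dirichlet convolution of two functions on 1..limit.

    xi and eta are lists indexed from 1 (entry 0 is unused); the basis
    rule x_s * x_t = x_(s*t) is assumed, and products whose index
    exceeds limit are discarded.
    """
    out = [0] * (limit + 1)
    for s in range(1, limit + 1):
        if xi[s] == 0:
            continue
        for q in range(1, limit // s + 1):
            out[s * q] += xi[s] * eta[q]
    return out


limit = 12
ones = [0] + [1] * limit                 # the function equal to 1 everywhere
divisor_count = dirichlet_product(ones, ones, limit)
unit = [0, 1] + [0] * (limit - 1)        # x_1, the modulus of the algebra
```

The square of the constant function 1 counts divisors, and $x_1$ acts as the modulus, as one expects of the arithmetical functions that this algebra is designed to handle.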

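(9.32) If G is the real interval $a \le t < \infty$, and $\sum$ is interpreted as
$\int (\ )\,\frac{ds}{s}$, we get, when a = 1,

(9.321) $\xi(t) \times \eta(t) = \int_1^t \xi(s)\,\eta(t/s)\,\frac{ds}{s},$

which may also be written


$\int_0^{\log t} \xi(e^s)\,\eta(t e^{-s})\,ds.$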
Examples (9.2) and (9.3) belong to the special type
$x_s x_t = x_{\varphi(s,t)}, \quad \varphi(r, \varphi(s, t)) = \varphi(\varphi(r, s), t).$
When $\varphi$ is properly restricted, it follows from the theory of one-parameter
continuous groups that any algebra of this type can be reduced to (9.2) 
by a change of variable. For instance, if in (9.2) we put 2} = ze, 


The forms (9.231, 321) may be also derived directly from (2.72) by putting


$\lambda(r, s, t) = 1 \ (t = \varphi(r, s)), \qquad \lambda(r, s, t) = 0 \ (t \neq \varphi(r, s)),$


and interpreting the corresponding integral as a double Stieltjes integral
with respect to $\lambda(r, s, t)$.
This type of algebra may be generalized as follows. Let $\varphi_i(s; t)$ $(i = 1, 2, \cdots, n)$
be a set of functions having the group property
$\varphi_i(\varphi(r; s); t) = \varphi_i(r; \varphi(s; t)).$


An associative algebra is then defined by


$x_{s_1 s_2 \cdots s_n}\, x_{t_1 t_2 \cdots t_n} = x_{\varphi_1(s;t)\,\varphi_2(s;t) \cdots \varphi_n(s;t)}.$


* Cf. these Transactions, vol. 25 (1923), p. 135.


Example 9.4. An example of a non-commutative algebra is given by 
(9.41) Xe = 
Modifications of this algebra are given by 


(9.43) J's -4-t—k + 43—t+ks 


(9.44) Xe = 

Example 9.5. Corresponding to the ordinary algebra of matrices we
have the algebra generated by the units $x_{st}$, where s and t run through
a range G and


(9.51) $x_{st} x_{pq} = x_{sq} \ (t = p), \qquad x_{st} x_{pq} = 0 \ (t \neq p).$


This is the simple matric algebra used in § 7. The corresponding functional
product is


(9.52) $\xi(s, t) \times \eta(s, t) = \int \xi(s, r)\,\eta(r, t)\,dr.$


This algebra and its subalgebras have been developed to a considerable
extent by writers on the theory of integral equations.
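
The law of combination of matric units, and the way the units $e_{st} = e_{sp} e_{pt}$ of § 7 are built from those with one fixed index p, can be illustrated for finitely supported elements. In the sketch below a unit is encoded as the pair (s, t) of its indices and an element as a dictionary from such pairs to coordinates; this encoding and the names `unit_product` and `matric_product` are assumptions of the example, and the range G may be any set of hashable labels.

```python
def unit_product(u, v):
    """Product of two matric units: (s, t)(p, q) = (s, q) if t == p, else None (zero)."""
    (s, t), (p, q) = u, v
    return (s, q) if t == p else None


def matric_product(xi, eta):
    """Product of two finitely supported elements of the matric algebra.

    xi and eta map index pairs (s, t) to coordinates; the result has
    coordinate sum_r xi[(s, r)] * eta[(r, t)] at (s, t).
    """
    out = {}
    for (s, r), a in xi.items():
        for (r2, t), b in eta.items():
            if r == r2:
                out[(s, t)] = out.get((s, t), 0) + a * b
    return out


x = {("a", "a"): 1, ("a", "b"): 2}
y = {("b", "a"): 3, ("a", "a"): 4}
xy = matric_product(x, y)
```

The coordinate of the product at (a, a) is 1·4 + 2·3, the finite counterpart of the functional product, with the sum over the middle index taking the place of the integral.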

Example 9.6. The Grassmann-Gibbs indeterminate product (the algebra
of tensors) forms another important example. Here the units have the
form $x_{t_1 t_2 \cdots}$, the subscripts belonging to a range* G which includes 0
while, if $t_i = 0$, all subsequent subscripts are also 0, and the law of
combination is

(9.61) $x_{s_1 \cdots s_m 0 \cdots}\; x_{t_1 \cdots t_n 0 \cdots} = x_{s_1 \cdots s_m t_1 \cdots t_n 0 \cdots} \quad (s_m \neq 0,\ t_n \neq 0).$


It is usual to add the restriction that the first subscript is not 0, but, if
this is not done, $x_{00\cdots}$ is the modulus of the algebra. The corresponding
functional product is


(9.62) $\xi(t_1, t_2, \cdots, t_m) \times \eta(t_1, t_2, \cdots, t_n) = \xi(t_1, t_2, \cdots, t_m)\,\eta(t_{m+1}, \cdots, t_{m+n}).$


*This range is usually the set of positive integers and 0, but it may of course be
taken to be continuous.


Example 9.7. Let A be the algebra generated by a, b, ¢ where 
ab be = ca 


If we set / a*, we easily find that fis commutative with every element 
of A and that 
at = = -* = ta b, = tobe, ac = fea. 

When sums with a finite number of terms alone are allowed, this algebra
has no idempotent elements in spite of the fact that $A^2 = A$, which in
the case of algebras with a finite basis always implies the presence of at 
least one idempotent element. If, on the other hand, infinite sums are 
allowed, A has exactly four such elements, namely 


(9.71) 1 , 
V1+14f+/* 

where $1/\sqrt{1 + 14f + f^2}$ is to be expanded in a series of positive powers
of f, and the numerical term canceled before interpretation, and the signs 
are either all + or two — and one +. This algebra may be used to 
illustrate the Peirce form of § 5 but as the actual expressions are some- 
what cumbrous they are not given here. 

If e stands for any one of the elements in (9.71), eAe forms a commutative 
division algebra. 

Example 9.8. Let A be the algebra generated by two elements x and y 
for which 
(9.81) 


where $\alpha$ is a scalar different from 0 and 1, r, s are positive or negative
integers or 0, and $x^0 = y^0 = e$ is the modulus of the algebra.
Let f(x) (r = 0,1,---) denote series of the form 
(9.82) (tore + + toy +--+) ( Gor + 0), 


where , is an integer, positive, negative or 0; /,(.”) has a unique inverse 
of the same type which is obtained by formally inverting the series for /,. 
The algebra is then defined as the set of elements of the form 


(9.83) 


where is an integer which may be negative. The sum and product of 
two such elements have the same form so that this set does in fact form 
an algebra. 



We shall now show that every element of the set except 0 has an inverse. 
We may evidently assume » = 0, and there is also no loss of generality 
in taking f)(2) = e. If we form the product 


(9.84) (e—ny—gy—:: 


and set f’” = »"fiy~", the condition that the result equals ¢ is 


== Is = Ss + 


and hence the g's are uniquely determined, each has the form of (9.82)
and the left hand factor of (9.84) has the form (9.83). Every element
therefore has an inverse and hence A is a division algebra.

It should be noted that no question of convergence is involved here, 
although as a matter of fact the convergence is easily investigated when 
ei<f. 


In place of (9.81) we may evidently take
$yx = \theta(x)\,y$


where $\theta(x)$ is any function whose iterated powers are known for positive
and negative indices. For example we may set


(9.85) gx = 


which also gives rise to a division algebra under conditions similar to those 
imposed above. In this algebra y?2 = xy’, so that every element has 
the form y*)y. 

Example 9.9. Let A be the algebra generated by an element a which
satisfies the equation


(9.91) g(a) = 0,


where $g(\xi)$ is an integral function of $\xi$ and the elements of A consist
of all integral functions of a, that is, of the elements which result from
substituting a for $\xi$ in any integral function of $\xi$. The modulus of the
algebra then corresponds to the unit 1 of the field and we shall use the
same symbol for both.

The function $g(\xi)$ must vanish for some value of $\xi$ in the field of complex
numbers; for, if $g(\xi) = \exp \omega(\xi)$, $\omega(\xi)$ integral, then $\exp(-\omega(\xi))$
is also integral so that $\exp(-\omega(a)) = b$ is an element of A for which
b g(a) = 1, which contradicts (9.91). If $f(\xi)$ is an integral function
of $\xi$ which has no zero in common with $g(\xi)$, the element f(a) of A has
an inverse; for there always exist integral functions* $A(\xi)$ and $B(\xi)$ such
that $A(\xi) g(\xi) + B(\xi) f(\xi) = 1$, whence B(a) f(a) = 1.

Let us suppose now that the roots of g(§) are all simple; the Mittag- 
Leffler expansion of 1/g(£) then has the form 


+G(§ 
p(&) Sai) + G(s) 
where a, = 1/9'(gn) and @(§£) is an integral function and, if we set 
9.92 y,(§) = | p(s), 
then 
(9.93) 1 = unl’) + (8) 48). 


Since w,,(§) has the form 
1+(§—gn) 0,(§) 
where 6,(&) is integral, it follows immediately that 


where 6,(&) is integral, and also it is clear from (9.92) that w,(&) Wm (§) 
vanishes for every root of gy(&); hence, if 


= 
we have 
= 0 (m + n) 
and from (9.93) 
= 1. 


The algebra is therefore equivalent to (9.1), provided of course that the
field contains the roots of g. The converse theorem is obvious.
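
A finite analogue of this construction may make the mechanism plainer. The sketch below is not the integral-function case of the example: it assumes that the defining function is a polynomial with distinct rational roots, takes for the idempotents the Lagrange polynomials (the defining polynomial divided by $\xi$ minus a root and by the derivative at that root), and checks the relations modulo the defining polynomial. Polynomials are stored as coefficient lists, lowest degree first; the names `poly_mul`, `poly_mod` and `idempotents` are hypothetical.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_mod(p, g):
    """Remainder of p on division by the monic polynomial g."""
    p = list(p)
    while len(p) >= len(g):
        lead = p[-1]
        shift = len(p) - len(g)
        for i, c in enumerate(g):
            p[shift + i] -= lead * c
        p.pop()                      # the leading coefficient is now zero
    return p + [Fraction(0)] * (len(g) - 1 - len(p))


def idempotents(roots):
    """Return (g, [psi_n]) for the monic polynomial g with the given simple roots."""
    roots = [Fraction(r) for r in roots]
    g = [Fraction(1)]
    for r in roots:
        g = poly_mul(g, [-r, Fraction(1)])
    psis = []
    for n, rn in enumerate(roots):
        numerator, derivative = [Fraction(1)], Fraction(1)
        for m, rm in enumerate(roots):
            if m != n:
                numerator = poly_mul(numerator, [-rm, Fraction(1)])
                derivative *= rn - rm      # builds up g'(rn)
        psis.append([c / derivative for c in numerator])
    return g, psis


g, psis = idempotents([0, 1, 3])
```

Modulo g each $\psi_n$ is idempotent, two different ones multiply to zero, and they add up to 1, so that the algebra generated by a root of g splits as in Example 9.1.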

In conclusion we remark that, if the roots of $g(\xi)$ are not simple, it
can be shown in much the same way as in algebras of finite rank that A
then possesses a nilpotent invariant subalgebra; also, even if $g(\xi)$ has no
roots in the given field, it is easily shown that A possesses at least one
idempotent element.


* Cf. these Transactions, vol. 16 (1915), p. 329. 


PRINCETON UNIVERSITY, 
Princeton, N. J. 


DETERMINATION OF ALL THE PRIME POWER GROUPS 
CONTAINING ONLY ONE INVARIANT SUBGROUP OF 
EVERY INDEX WHICH EXCEEDS THIS PRIME NUMBER* 


BY 


H. A. BENDER 


The groups of order $p^m$, p being any prime number, which satisfy the
following conditions have been determined: (1) m < 7†, (2) those which
contain operators of order $p^\alpha$ ($\alpha > m - 4$)‡, (3) those containing the abelian
group of order $p^{m-1}$ and of type (1, 1, 1, ···)§, (4) those containing exactly
p + 1 abelian subgroups of order $p^{m-1}$||, (5) those containing exactly
p cyclic subgroups of order $p^\alpha$¶. The present paper is devoted to
a complete determination of the groups of order $p^m$ which satisfy the
condition that each group contains only one invariant subgroup of every
index which exceeds p. There exists at least one such group for every
value of m and p, viz., the cyclic group of order $p^m$. This is, however,
the only abelian group which satisfies the above condition.

Let G be a non-cyclic group of order p”, p being any prime number, 
which contains but one invariant subgroup of every index which exceeds 
this prime number. Since G is non-cyclic it involves more than one sub- 
group of index p. The cross cut of any two such subgroups is invariant 
under G and hence is the invariant subgroup of index $p^2$. Since G contains
but one invariant subgroup of index $p^2$ this cross cut must include all the
commutators of G as well as the pth powers of all its operators. From 
this it follows that the quotient group corresponding to this cross cut is 
non-cyclic, and hence any quotient group of G whose order exceeds p must 
be non-cyclic. 

* Presented to the Society, April 18, 1924. 

† $m = 3, 4$, O. Hölder, Mathematische Annalen, vol. 43 (1893), p. 371. $m = 5$,
G. Bagnera, Annali di Matematica, ser. 3, vol. 1 (1898), p. 137, and vol. 2 (1899),
p. 263. $m = 6$, M. Potron, Thèses, Gauthier-Villars, Paris, 1904.

‡ $\alpha = m-1, m-2$, W. Burnside, Theory of Groups of Finite Order, 1897, p. 75.
$\alpha = m-2$, G. A. Miller, these Transactions, vol. 2 (1901), p. 259, and vol. 3 (1902),
p. 383. $\alpha = m-3$ ($p > 2$), L. I. Neikirk, these Transactions, vol. 6 (1905), p. 316.
$\alpha = m-3$ ($p = 2$), Miss McKelden, American Mathematical Monthly, vol. 13
(1906), p. 121.

§ G. A. Miller, Bulletin of the American Mathematical Society, vol. 8 (1901), p. 391.

|| G. A. Miller, Bulletin of the American Mathematical Society, vol. 13 (1906), p. 171.

¶ G. A. Miller, these Transactions, vol. 7 (1906), p. 228.



If G contains but one invariant subgroup of index $p^2$ this subgroup
must be the commutator subgroup of G, otherwise it would include the
commutator subgroup and the commutator quotient group would be
non-cyclic abelian and of a larger order than $p^2$. Since this quotient group
would then contain more than one invariant subgroup of index $p^2$, it follows
that G would contain more than one invariant subgroup of index $p^2$, contrary
to hypothesis. Moreover, if the commutator subgroup of a group of order
$p^m$ is of index $p^2$ the group contains only one invariant subgroup of this
index, since the commutator subgroup is found in every such subgroup.

We shall represent the invariant subgroups of orders 1, p, $p^2$, $p^3$, ...,
$p^{m-2}$ by $G_0, G_1, G_2, G_3, \dots, G_{m-2}$ respectively, and the operators of $G_\alpha$ not
in $G_{\alpha-1}$ by the major co-set $G_{\alpha-1}s_\alpha$.* Since an operator in the major
co-set $G_1 s_2$ has but p conjugates there are $p^{m-1}$ operators commutative
with $G_2$, and we shall represent the subgroup composed of these operators
by $G_{m-1}$.

It has been shown that $G_{m-2}$ is the commutator subgroup of G. The second
commutator subgroup is a subgroup generated by all the commutators of
the group which have for one element a commutator while the other element
is an arbitrary element of the group.† It is evident that this second
commutator subgroup is invariant under G and hence is one of the invariant
subgroups. Suppose $G_{m-\alpha}$ to be one of the successive commutator subgroups
and suppose the commutators of G, which have for one element an operator
of $G_{m-\alpha}$ while the other element is an arbitrary element of G, to generate
the invariant subgroup $G_{m-\alpha-\beta}$. The quotient group of G with respect
to $G_{m-\alpha-\beta}$ will at least contain a central of order $p^\beta$. Hence if $\beta$ is
greater than one, G will contain more than one invariant subgroup of
order $p^{m-\alpha-\beta+1}$. From this it follows that the $\alpha$th commutator subgroup
is the invariant subgroup $G_{m-\alpha-1}$ ($\alpha = 1, 2, 3, \dots, m-2$).

If the $(m-2)$th commutator subgroup is of order p, it implies that
the first commutator subgroup is of index $p^2$, the second of index $p^3$, etc.
If a group which has only one invariant subgroup of order $p^\alpha$, which is
also one of the successive commutator subgroups, had more than one
invariant subgroup of order $p^{\alpha-1}$, then the next successive commutator
subgroup would be contained in each of these invariant subgroups, and
hence would be of a lower order than $p^{\alpha-1}$. Hence the following theorem:

A necessary and sufficient condition that a group G of order $p^m$, p being
any prime number, contain only one invariant subgroup of every index
greater than p is that its $(m-2)$th commutator subgroup be of order p.

As an interesting system composed of groups of order $p^m$ such that


* American Journal of Mathematics, vol. 45 (1923), p. 231. 
† W. B. Fite, these Transactions, vol. 7 (1906), p. 61.


1924] PRIME POWER GROUPS 429 


each group contains only one invariant subgroup of each of the orders
$p^2, \dots, p^{m-2}$, we may note the Sylow subgroup of order $p^{p+1}$
contained in the symmetric group of degree $p^2$. It is obvious that such
a Sylow subgroup contains a subgroup of order $p^p$ which is the direct
product of p regular cyclic groups of order p. If the generators of these
p regular groups are represented by $s_1, s_2, s_3, \dots, s_p$, respectively, and
if t represents the substitution of order p and of degree $p^2$ which satisfies
the condition

$$t^{-1}s_1t = s_2,\quad t^{-1}s_2t = s_3,\quad \dots,\quad t^{-1}s_pt = s_1,$$

it is evident that t and the said p generators give rise to the following
$p-1$ commutators:

$$s_1^{-1}s_2,\quad s_2^{-1}s_3,\quad \dots,\quad s_{p-1}^{-1}s_p.$$

These commutators generate a group of order $p^{p-1}$ which is therefore the
only invariant subgroup of index $p^2$ contained in the group. The second
commutator subgroup is of order $p^{p-2}$, etc. It follows from the preceding
theorem that the Sylow subgroup of the symmetric group of degree $p^2$
contains only one invariant subgroup of each of the orders $p, p^2, p^3, \dots, p^{p-1}$.
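
The Sylow example can be checked by machine for the smallest odd case. The following is only a sketch under stated assumptions: it takes p = 3, realises the group on the points 0, ..., 8, and uses illustrative names (`s1`, `t`, `generated_group`, `commutator_group`) that are not in the paper. It computes the successive commutator subgroups in the sense used above (commutators of the preceding subgroup with arbitrary elements of the group) and lists their orders.

```python
# Sketch: Sylow 3-subgroup of the symmetric group of degree 9 (assumption: p = 3).
# A permutation is a tuple; perm[i] is the image of the point i.

def compose(a, b):
    """Return the permutation 'apply a, then b'."""
    return tuple(b[a[i]] for i in range(len(a)))

def inverse(a):
    inv = [0] * len(a)
    for i, image in enumerate(a):
        inv[image] = i
    return tuple(inv)

def generated_group(generators, degree):
    """Closure of the generators under multiplication (a subgroup, since finite)."""
    identity = tuple(range(degree))
    elements = {identity}
    frontier = [identity]
    while frontier:
        new = []
        for g in frontier:
            for s in generators:
                h = compose(g, s)
                if h not in elements:
                    elements.add(h)
                    new.append(h)
        frontier = new
    return elements

def commutator_group(A, B, degree):
    """Subgroup generated by all commutators a^-1 b^-1 a b with a in A, b in B."""
    comms = {compose(compose(inverse(a), inverse(b)), compose(a, b))
             for a in A for b in B}
    return generated_group(list(comms), degree)

p = 3
degree = p * p
s1 = tuple([1, 2, 0] + list(range(3, degree)))      # the cycle (0 1 2) on the first block
t = tuple((i + p) % degree for i in range(degree))  # order p, carries each block to the next

G = generated_group([s1, t], degree)
series = [G]
while len(series[-1]) > 1:
    series.append(commutator_group(series[-1], G, degree))
orders = [len(H) for H in series]
print(orders)
```

For p = 3 the orders come out as $p^{p+1}$, $p^{p-1}$, $p^{p-2}$, 1, in agreement with the orders stated in the text.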

The operators of G which transform the operators of Ga(@<m) into 
themselves multiplied by operators in @,_, constitute an invariant subgroup 
of G. For suppose #4, to be such an operator, and let s be an operator 
of Ge and ¢ any operator of G, and t-!st = s'’s; then 


= st,t-? = 


It follows from this that all the conjugates under G of 4 transform the 
operators of G« into themselves multiplied by operators in @,_, and hence 
they constitute an invariant subgroup of G. 

Since Gm—1 is commutative with G, it transforms the operators of the 
major co-set Gis. into themselves multiplied by operators in the second 
major co-set which precedes. Let us suppose the major co-set Ge-1s« to 
be the first in which the operators are not transformed into themselves 
multiplied by operators in at least the second major co-set which precedes. 
It is assumed that G,»—, transforms the operators of Ge—1 into themselves 
multiplied by operators in Ges, and that some of the operators of Gm—1 
transform the operators in the major co-set Geis. into themselves 
multiplied by operators in the major co-set Ge—ss«—1. 

All the operators of G which transform the operators of Gz. into 
themselves multiplied by operators in Ge—s form an invariant subgroup 
of G, say H. Suppose Gm—; to contain an operator ¢, which transforms 
the operators of the major co-set Ge-1s«¢ into themselves multiplied by 


operators in the major co-set Ga—ssa—2. Evidently the pth power of ¢, will 
transform the operators of G. into themselves multiplied by operators in 
Ges, and hence the pth power of ¢#; is in H. The group generated 
by H and 4 will contain all the operators of G which transform the 
operators of Gq into themselves multiplied by operators in Ge-2. For 
suppose ¢, transforms sq into itself multiplied by some operator in the 
major co-set Gu—sse—2; then there exists some power of 4 which will 
transform se into itself multiplied by an operator in the co-set containing 
the inverse of the commutator of ¢ and s«. The product of 4 to this 
power and ¢, will transform s« into itself multiplied by an operator in 
Ge-3, and hence this product is in H. Thus H-t contains all the 
operators of G@ which transform the operators of G into themselves 
multiplied by operators in Gas. 

In the same manner it can be shown that all the operators which trans- 
form the operators of Gz into themselves multiplied by operators in Ge_, will 
generate an invariant subgroup whose order is p times the order of H-t,, 
and as we have seen this must be @ itself. Hence we have shown that G»,-1 
can not contain an operator which will transform s¢ into itself multiplied 
by an operator in the major co-set Ge—2 se-1 and at the same time contain 
another operator which will transform s« into itself multiplied by an operator 
of the major co-set Ge—s 

It should be noted that the group generated by the commutators of G 
which have for their elements operators of any two invariant subgroups 
is invariant under the original group G. From this and what precedes it 
follows that the operators of Gm-1: transform the operators of Ge into them- 
selves multiplied by operators in Ge—2 (a = 2,3, 4,---,m—2). Further- 
more, each operator of the major co-set Gm—i 8m transforms the operators of 
the major co-set Ga—1 s« into themselves multiplied by operators in the major 
co-set (@ = 1, 2, 3,---,m—1). Thus it follows that whenever 
a non-cyclic group of order p™, p being any prime number, contains only 
one invariant subgroup of every index greater than p, it must also contain 
a subgroup of index p which includes all of its operators whose orders 
exceed p*, and the pth powers of every operator not in this subgroup must 
be in the invariant subgroup of order p. It should be noted that every 
operator in the commutator subgroup is a commutator. 

Let Gz be the largest invariant abelian subgroup of G. It is evident 
that the operators of the major co-set G3 s,,, must be commutative with 
the operators of some subgroup of Gs, say G@,(«@<), but not commutative 
with the operators of the major co-set Ge sei1. The commutators formed 
by the operators of G,,, and Gs,, are in G, and hence are invariant 
under G3,,. Suppose 




where s, is some operator of G,: then 


Since the pth power of s; , is in @ 3 it must be commutative with s,.,, 

and hence s, must be an operator of order p. Furthermore each operator 

of the co-set containing s;,., transforms every operator of the co-set 

containing s,., in this manner. Hence the commutators formed by the 

operators of G,., and @3., generate the invariant subgroup of order p. 
Again let us consider the commutator 


and suppose ¢ to be an operator of the major co-set Gm—, s, such that 


gf Si—1 (7 m—1); then 

1 1 1 1 -1 1 —1] 1 
/ / 1 Se 1 +3 1 Sa 1 / S3+1 Su ‘3 Se 1 

Let us now consider the commutator 

1 —1 
+3 1 So 

where s, is some operator of Ge 1. Transforming by ¢. 


Thus it follows that the operators of the major co-set G3 s3,, transform 
the operators of the major co-set G,., %,.. into themselves multiplied by 
operators in the major co-set G, ss. 

Since this property must hold for the quotient group, and if we form the 
successive quotient groups with respect to the invariant subgroup of order », 
it follows that the operators of the major co-set G3 s;,, transform the 
operators of the major co-set @,,, %+»;, into themselves multiplied by 
operators in the major co-set G, s,.,(@ = 0,1.2,---,8—«—1). That 
is, the operators of the major co-set G3 s3,, transform each operator of G; 
into itself multiplied by an operator in the eth major co-set which precedes 
the major co-set containing this operator. 

The transformation of any operator s, of Gz by the pth power of ¢ is 




where the elements with zero or negative subscripts are unity. Since the 
pth power of ¢ is commutative with every operator of G, and if we let s, 
represent successively an operator in the major co-sets Gs, Goss, - 
++, Gp 8p (8 = p) it follows that all the operators of G,—1, except identity, 
are of order p. If sp is an operator in the major co-set Gp spi1 (8 = p +1) 
it follows that s, s? 1, and hence all the operators of the major co- 
set Gp: sp are of order p* and have their pth powers in G,. Since this 
property must hold for the successive quotient groups with respect to the 
invariant subgroup of order p, it follows that all the operators of Gap—1.) 
not in @p-1 are of order p*, and so on, and the pth power of each operator 
is in the (y —1)th major co-set which precedes the major co-set containing 
this operator. Hence it follows that « ~ 8—(p—1). 

Let us now assume that @ does not contain an invariant abelian subgroup 
of index 

Since all the operators of G@ commutative with the operators of Ge+1: form 
an invariant subgroup of @, it follows that the operators of the major co- 
set (@3., 53. can not be commutative with G,.,. Let us suppose the 
operators of G3,, to be commutative with the operators of G3. As before 
the commutators formed by the operators of @;., and @ 3.. generate the 
invariant subgroup of order p. 

If d = «, then some operator of the major co-set @3,, sg... would trans- 
form s,, into itself multiplied by s>' and the product of this operator 
and sz. would be commutative with G,,,,. The pth power of this operator 
is in G3 and hence would with (; generate an invariant subgroup of order pA 
which differs from contrary to hypothesis, and hence 6 —e. 

In general we may suppose G, to be the largest invariant abelian sub- 
group of G composed of operators which are commutative with every operator 
of G,(o~-8); then the operators of the major co-set G,_, s, transform the 
operators of the major co-set (sg, , into themselves multiplied by operators 
in G,. Let us assume the central of @, , to be G@j_, (7 > 0), and suppose 


Transforming by ¢, and since s, is not commutative with sj'.. for y = 1 
but is commutative for y>1, then for ;~ 1 


and hence s is in the major co-set G;, se. 
It follows in either case from the successive quotient groups that the 
central of G,,, is contained in the central of G, and the two are distinct. 




and that all the operators in the major co-set G, s, , will transform every 
operator of G, into itself multiplied by an operator in the éth major 
co-set Which precedes the major co-set containing this operator (¢ = @+1. 
8+2,---,m—2). Furthermore, the commutator subgroup of G,,—; must 
be composed of operators of order p, and it can at most be of order p? °. 

Let us now suppose #-— », and suppose the operators of the major co-set 
(1489.1 to be of order p* (@--p.¢-_m—1). It would then be possible 
to construct a group having an invariant abelian subgroup of order p”. 
and having G, for an invariant subgroup of its quotient group with respect 
to the invariant subgroup of order p”~*, and hence the pth powers of the 
operators of the major co-set G, ¢-38, ¢—3-1 Would be at most in the 
oth major co-set which precedes, and, as we have seen, this must be the 
(p—1)th major co-set which precedes. Hence the properties established 
above hold for any value of ~. 

The index of the largest invariant abelian subgroup of G can not exceed 
pen? and the order of this subgroup can not be less than p™*? for 
m odd, or for m even. 

If m >> p, then can be generated by » —1 independent generators 
which are such that the cyclic groups they generate have only the identity 
in common, and the ratio of the orders of any two of these independent 
generators is either 1 or p. If G is of a lower order, then all the 
independent generators are of order ». 

Since 


it follows that all the operators of the major co-set G,—2¢ have the same 
pth power, and hence (fs¢—1)” = ¢? whenever se—1 is preceded by p— 3. 
or less, independent generators 


1 


(tSm—1) Sm—1 t Sm—1 Sm—1 Sm—p Sm—p+1*** Sm 


Since this is a product of operators in Gyn—p, and s,»—p+1 is an independent 
generator, it follows that ¢ can be so chosen that this product is either 
the identity or an operator of order p in G;, except in the case where 
Sm-p 18 the only operator in this major co-set, i.e., for m—3 and p= 2. 
Hence for a given m and p the order of every operator of $G_{m-1}$ is
determined. For $m > p$ there are three groups containing the same subgroup
of order $p^{m-1}$: either all the operators of the major co-set $G_{m-1}s_m$ are
of order p, or all are of order $p^2$, or one pth of them are of order p and
the remaining operators are of order $p^2$. For $m \le p$ there are only two
such groups: either all the operators of this major co-set are of order p,
or all the operators are of order $p^2$.

We shall now consider all the possible subgroups of order $p^{m-1}$ ($m > p+1$).
If $G_{m-1}$ is abelian there is only one possible subgroup of order $p^{m-1}$.
If $G_{m-2}$ is abelian, then the central of $G_{m-1}$ can be either $G_{m-p}$,
$G_{m-p+1}$, ..., or $G_{m-3}$. Hence there are $p-2$ subgroups of order $p^{m-1}$
containing an invariant abelian subgroup of order $p^{m-2}$, without containing
an abelian subgroup of a larger order. If $G_{m-3}$ is abelian, then the central
of Gm—2 can be either Gn—p 2, OY and the number 
of groups of order is 1,2,3,---, p—A4. respectively (p >- 3). This 
process may be continued, and hence it follows that the number of subgroups
of order $p^{m-1}$ containing an invariant abelian subgroup whose order is
exactly $p^{m-\gamma}$ is the sum of $p-2(\gamma-1)$ terms of the figurate numbers
of the $(\gamma-1)$th order. Hence the number of subgroups of order $p^{m-1}$ is

$$1 + (p-2) + \frac{(p-3)(p-4)}{2!} + \frac{(p-4)(p-5)(p-6)}{3!} + \cdots.$$


The sum of this series is the (p-+1)th term of the Pisano recurrent 
sequence*. This follows immediately, for if to the first term of the series 
for p equals p—2 we add the second term of the series for p equals p—1, 
etc. there results the series for p equals p. Hence the number of
non-abelian groups of order $p^m$ containing only one invariant subgroup of
every index which exceeds this prime number is
3 i+ 1—_V5\" 
| — | (m > p-+1). 
Vd 2 2 
3 q 1 A 1 5 m—2 
V5 
2 1+V5\"" (1—V5\"~ 
V5 2 2 
with the single exception $m = 3$ and $p = 2$.
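
The statement that the series sums to a term of the Pisano (Fibonacci) sequence can be checked numerically. The sketch below assumes the series is read as $1 + (p-2) + (p-3)(p-4)/2! + \cdots$, that is, as the sum of the binomial coefficients $\binom{p-1-k}{k}$, and it indexes the sequence as $F_0 = 0$, $F_1 = 1$; both the reading and the function names are assumptions made for illustration, and which term counts as "the $(p+1)$th" depends on where the sequence is taken to start.

```python
from math import comb

def series_sum(p):
    """Sum of C(p-1-k, k): 1 + (p-2) + (p-3)(p-4)/2! + ... (assumed reading)."""
    return sum(comb(p - 1 - k, k) for k in range(p))

def fibonacci(n):
    """F_0 = 0, F_1 = 1, F_{n+1} = F_n + F_{n-1}."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

for p in (2, 3, 5, 7, 11, 13):
    print(p, series_sum(p), fibonacci(p))
```

With the sequence written 0, 1, 1, 2, 3, 5, ..., the value $F_p$ is the $(p+1)$th term, which matches the wording of the text under this convention.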


* L. E. Dickson, History of the Theory of Numbers, vol. 1, chapter xvii.


UNIVERSITY OF ILLINOIS. 
URBANA


INVARIANTS OF THE LINEAR GROUP MODULO $\pi = p_1^{\lambda_1} p_2^{\lambda_2} \cdots p_n^{\lambda_n}$*
BY 


CORNELIUS GOUWENS 


1. INTRODUCTION 
The object of this paper is to obtain a fundamental system of polynomial
invariants with integral coefficients of the linear group in q variables with
respect to an arbitrary modulus $\pi$.
For the case in which $\pi$ is a prime p, Dickson† proved that a fundamental
system is given by

(s =1,---,q—1) 
where 
q—1 q—1 
1 q i q 
Ty 
1 q 
Ky FY ry 


Mrs. Ballantine‡ proved that for $\pi = p_1 p_2 \cdots p_n$, $q = 2$, every invariant
is of the form


+ 


where $k_i$ is an integer and $g_i$ is a polynomial with integral coefficients.


Feldstein§ proved that for $\pi = p^\lambda$ a fundamental system is given by


> pee b, pe 
, (¢=1,---.qg—1), B, =ypL 


i, 4,8 i,g,a,b.j i q 


1), 


where « and bs range over 0,1.---,p—1, but may not all be zero. 


* Presented to the Society, April 19, 1924. 

† Madison Colloquium Lectures, p. 39.

‡ American Journal of Mathematics, vol. 45 (1923), pp. 286 ff.

§ These Transactions, vol. 25 (1923), pp. 223 ff. The notation $R_{i,q,a,b,j}$ was introduced
by the present writer.




In the present paper it is shown that the method of Mrs. Ballantine can
be extended from 2 to q variables. After a simplification of that method
which enables us to avoid the use of the actual coefficients of the
transformation employed the conclusion reached is the theorem

Every invariant of the group $\Gamma$ of classes of transformations with
determinant congruent to unity, modulo $\pi = p_1^{\lambda_1} p_2^{\lambda_2} \cdots p_n^{\lambda_n}$, is a sum of invariants
of $\Gamma$, modulo $\pi$, each of which is expressible as a product of $m_i = \pi/p_i^{\lambda_i}$ by
an invariant of the group $H_i$ of classes of transformations with determinant congruent to
unity, modulo $p_i^{\lambda_i}$, and conversely, every such product is an invariant of $\Gamma$.


2. THE GROUPS $G_i$

We call two linear transformations congruent modulo $\pi$ if their
corresponding coefficients are congruent. All transformations congruent to a chosen
one T, modulo $\pi$, are said to form a class $[T]_\pi$. The classes $[T]_\pi$ with
determinant $|T| \equiv 1 \pmod{\pi}$ are the elements of a group $\Gamma$.

Let $p_i$ be a prime factor of $\pi$ and let $P = p_i^{\lambda_i}$ be the highest power
of $p_i$ which divides $\pi$. Write $\pi = m_i P$. Let $G_i$ denote the subgroup
formed of those classes of transformations of $\Gamma$ which are congruent
modulo $m_i$ to the identity transformation I. Hence $G_i$ is composed of
the classes

(1) $T \equiv I \pmod{m_i}$, $|T| \equiv 1 \pmod{P}$,

the final congruence being a necessary and sufficient condition that
$|T| \equiv 1 \pmod{\pi}$, when $T \equiv I \pmod{m_i}$.

Our investigation is based on the theorem that $G_i$ is simply isomorphic
with the group $H_i$ of all classes $[S]_P$ (mod P) of transformations S whose
determinants are congruent to unity modulo P. First, all transformations
in a class (1) are congruent modulo $\pi$ and hence modulo P, and therefore
in a class $[S]_P$. Second, two transformations $T$ and $T_1$ in different
classes (1) are in different classes $[S]_P$. For if $T \equiv T_1 \pmod{P}$, then
$T \equiv I \equiv T_1 \pmod{m_i}$ implies $T \equiv T_1 \pmod{\pi = m_i P}$. Third, there
is a class (1) which corresponds to any given class $[S]_P$. For we can
find T (unique modulo $\pi$) such that $T \equiv S \pmod{P}$, $T \equiv I \pmod{m_i}$
since we can find an integer (unique modulo $\pi$) which is congruent to
two assigned integers with respect to the relatively prime moduli P and $m_i$.
Hence the classes (1) are in (1,1) correspondence with the classes $[S]_P$.
Finally, if $T_1 \equiv T_1'$, $T_2 \equiv T_2' \pmod{\pi}$ where all four T's satisfy the
congruences (1), then $T_3 \equiv T_1 T_2 \equiv T_1' T_2' \equiv T_3' \pmod{\pi}$ and $T_3$ and $T_3'$
satisfy the congruences (1). Hence the product $[T_1]_\pi [T_2]_\pi$ of the two
classes (1) is uniquely defined as a class $[T_3]_\pi$. Since the foregoing
congruences hold also modulo P, we have $[T_1]_P [T_2]_P = [T_3]_P$. Since
$p_i^{\lambda_i}$ was the highest power of $p_i$, any one of the n distinct prime factors
$p_i$ of $\pi$, we have
THEOREM I. In the group $\Gamma$ of all classes of transformations with
determinant congruent to unity, modulo $\pi$, the subgroup $G_i$ of all classes
of transformations congruent to the identity transformation modulo
$m_i = \pi/p_i^{\lambda_i}$ is simply isomorphic with the group $H_i$ of all classes of
transformations modulo $p_i^{\lambda_i}$ with determinant congruent to unity modulo $p_i^{\lambda_i}$.
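
The third step of the argument, finding T with $T \equiv S \pmod{P}$ and $T \equiv I \pmod{m_i}$, is an entry-by-entry use of the Chinese remainder theorem. The sketch below is illustrative only: it assumes $\pi = 12$, $P = 4$, $m_i = 3$, two variables, and a sample matrix S chosen here; the helper names `crt_pair` and `lift` are not from the paper.

```python
def crt_pair(a, P, b, m):
    """Return x in [0, P*m) with x = a (mod P) and x = b (mod m); gcd(P, m) = 1 assumed."""
    inv = pow(P, -1, m)                      # inverse of P modulo m (Python 3.8+)
    return (a + P * (((b - a) * inv) % m)) % (P * m)

def lift(S, P, m):
    """Entrywise lift of S (mod P) to T (mod P*m) with T = identity (mod m)."""
    size = len(S)
    return [[crt_pair(S[r][c], P, 1 if r == c else 0, m) for c in range(size)]
            for r in range(size)]

P, m = 4, 3                                  # assumed example: pi = 12 = 4 * 3
S = [[3, 1], [2, 1]]                         # determinant 1 modulo 4
T = lift(S, P, m)
det_T = T[0][0] * T[1][1] - T[0][1] * T[1][0]
print(T, det_T % (P * m))
```

In this example the lifted matrix has determinant congruent to unity modulo 12, as the congruences (1) require.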


3. THE GROUPS $G_1, \dots, G_n$ GENERATE $\Gamma$

We shall now prove the following 

LEMMA. The products $T_1 T_2 \cdots T_i$ are all distinct when $T_1, \dots, T_i$ range
over the classes of transformations of $G_1, G_2, \dots, G_i$ respectively, and
(for $i < n$) these products form the subgroup $J_i$ of classes $[U_i]_\pi$ of
transformations $U_i$ of $\Gamma$ which are congruent to the identity transformation
modulo $l_i = \pi/(p_1^{\lambda_1} \cdots p_i^{\lambda_i})$.

This is true by definition where $i = 1$,
that is $J_1 = G_1$, $l_1 = m_1$. Suppose it true when the above i is replaced by $i-1$.
Then first, the groups $J_{i-1}$ and $G_i$ have no class in common save that
of transformations congruent to the identity transformation modulo $\pi$.
For, suppose

$$[U_{i-1}]_\pi = [T_i]_\pi, \text{ viz., } U_{i-1} \equiv T_i \pmod{\pi}.$$

But
$$T_i \equiv I \pmod{m_i}$$
and
$$U_{i-1} \equiv I \pmod{l_{i-1}}$$

and hence, since $p_i^{\lambda_i}$ is a divisor of both $l_{i-1}$ and $\pi$, we have
$T_i \equiv I \pmod{p_i^{\lambda_i}}$.
Since $m_i$ is prime to $p_i^{\lambda_i}$ and their product is $\pi$ we get
$$T_i \equiv I \pmod{\pi}.$$

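Further, the classes $[U_{i-1} T_i]_\pi$ are all distinct where $U_{i-1}$, $T_i$ range
over representatives of the classes of transformations of $J_{i-1}$, $G_i$
respectively. For, if

$$[U_{i-1} T_i]_\pi = [U_{i-1}' T_i']_\pi,$$

then
$$U_{i-1} T_i \equiv U_{i-1}' T_i' \pmod{\pi}$$
and
$$U_{i-1}'^{-1} U_{i-1} \equiv T_i' T_i^{-1} \pmod{\pi}.$$

By the preceding result we have

$$U_{i-1}'^{-1} U_{i-1} \equiv I \pmod{\pi}$$
and therefore
$$U_{i-1} \equiv U_{i-1}', \quad T_i \equiv T_i' \pmod{\pi},$$
which imply
$$[U_{i-1}]_\pi = [U_{i-1}']_\pi, \quad [T_i]_\pi = [T_i']_\pi.$$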


The product of two transformations $U_{i-1}$, $T_i$ belonging to $J_{i-1}$ and
$G_i$ respectively is a transformation $U_i$ of the class $[U_i]_\pi$, $U_i \equiv I \pmod{l_i}$,
$|U_i| \equiv 1 \pmod{\pi}$. For

$$U_{i-1} \equiv I \pmod{l_{i-1}}, \quad T_i \equiv I \pmod{m_i}$$

imply $U_{i-1} T_i \equiv I \pmod{l_i}$, since $l_i$ is a divisor of both $l_{i-1}$ and $m_i$.
Conversely, given a transformation $U_i$ of the class $[U_i]_\pi$, $U_i \equiv I \pmod{l_i}$,
$|U_i| \equiv 1 \pmod{\pi}$, we can find $U_{i-1}$ and $T_i$ (unique modulo $\pi$) such that
$U_{i-1} T_i \equiv U_i \pmod{\pi}$. Now $U_i = I + K l_i$ where I is the identity matrix
and K is a known matrix. Take

$$U_{i-1} = I + sK\,l_{i-1}, \quad T_i = I + rK\,m_i,$$

where the integers s, r are solutions of

(1) $s\,l_{i-1} + r\,m_i = l_i$.

This last equation is solvable since $l_i$ is the greatest common divisor of
$l_{i-1}$ and $m_i$. Then

$$U_{i-1} T_i = I + K l_i + rs\,K^2\,l_{i-1} m_i \equiv U_i \pmod{\pi},$$

since $l_{i-1} m_i$ is divisible by $\pi$. Also $|U_{i-1}|$ and $|T_i|$ are of the form


and 1+ respectively. Then, since U; 1 (mod7z). 
we have 


(1 + (1+ arm) (mod 7). 




that is 


0 (moda). 
By (1) 
hence 
rh + slin 0 (mod). 
But /,_, is divisible by ps, therefore x/. is divisible by pi. Since /. is 


prime to pe, it follows that w is divisible by ps and is of the form > pi 
The determinant 7’, is therefore of the form / rp m,, hence congruent 
to unity, modulo. Therefore also 1(mod). This completes 
the induction. 

When $i = n$ the first half of the lemma still holds; thus all the products
$T_1 T_2 \cdots T_n$ are distinct, where $T_1, T_2, \dots, T_n$ range over representatives
of all the classes of $G_1, G_2, \dots, G_n$ respectively. The order of $\Gamma$ is thus
the product of the orders of the subgroups $G_1, G_2, \dots, G_n$. Hence

THEOREM II. The total group $\Gamma$ of classes of transformations with
determinant congruent to unity, modulo $\pi$, is obtainable by composition of the
n subgroups $G_i$ each composed of those classes of $\Gamma$ whose transformations
are congruent to the identity transformation, modulo $m_i = \pi/p_i^{\lambda_i}$.
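
The order statement preceding Theorem II can be checked by brute force in a small case. This is only a sketch under stated assumptions: two variables, $\pi = 12 = 2^2 \cdot 3$, and illustrative function names that do not come from the paper. It counts the classes of $\Gamma$, of $G_1$ and $G_2$, and of $H_1$ and $H_2$.

```python
from itertools import product

def unimodular_classes(n, m=1):
    """Count 2x2 matrices modulo n with determinant 1 (mod n) and congruent to I modulo m."""
    count = 0
    for a, b, c, d in product(range(n), repeat=4):
        if (a * d - b * c) % n != 1 % n:
            continue
        if (a - 1) % m == 0 and b % m == 0 and c % m == 0 and (d - 1) % m == 0:
            count += 1
    return count

modulus = 12                                  # assumed example: pi = 2^2 * 3
order_gamma = unimodular_classes(modulus)     # the whole group Gamma
order_g1 = unimodular_classes(modulus, m=3)   # G_1: congruent to I modulo m_1 = 3
order_g2 = unimodular_classes(modulus, m=4)   # G_2: congruent to I modulo m_2 = 4
order_h1 = unimodular_classes(4)              # H_1: classes modulo 2^2
order_h2 = unimodular_classes(3)              # H_2: classes modulo 3
print(order_gamma, order_g1, order_g2, order_h1, order_h2)
```

Here the orders of $G_1$ and $G_2$ agree with those of $H_1$ and $H_2$, as Theorem I requires, and their product is the order of $\Gamma$.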


4. DETERMINATION OF THE INVARIANTS OF $\Gamma$
Let $I(x_1, \dots, x_q)$ be any homogeneous rational integral function with
integral coefficients which is an invariant of $\Gamma$ modulo $\pi$, that is


+++, 4) +++, (mod 7), 
where 
| 


In particular, $I(x_1, \dots, x_q)$ is invariant under every class of
transformations $[T_i]_\pi$, $T_i \equiv I \pmod{m_i}$, $|T_i| \equiv 1 \pmod{\pi}$, that is under the coincident
class of transformations $[S_i]_P$, $S_i \equiv I \pmod{m_i}$, $|S_i| \equiv 1 \pmod{P}$. By the
isomorphism proved in Theorem I, when $T_i$ ranges over representatives of
all the classes of $G_i$, $S_i$ ranges over representatives of all the classes
of $H_i$ and thus $I(x_1, \dots, x_q)$ is invariant under all the transformations
of $H_i$ (mod $p_i^{\lambda_i}$) ($i = 1, \dots, n$). Conversely, if $I(x_1, \dots, x_q)$ is a rational
integral invariant under the group $H_i$ of all transformations S of classes $[S]_P$,
$S \equiv I \pmod{m_i}$, $|S| \equiv 1 \pmod{P}$, we see again by the isomorphism in
Theorem I that $I(x_1, \dots, x_q)$ is invariant under the corresponding $G_i$ of
the classes $T_i \equiv I \pmod{m_i}$, $|T_i| \equiv 1 \pmod{\pi}$. For it is invariant
modulo $p_i^{\lambda_i}$ and unchanged modulo $m_i$, therefore invariant modulo $\pi$.

Since by Theorem II the subgroups $G_i$ ($i = 1, \dots, n$) generate the total
group $\Gamma$, modulo $\pi$, if any rational function with integral coefficients is
invariant under every $H_i$, modulo $p_i^{\lambda_i}$ ($i = 1, \dots, n$), it is an invariant of $\Gamma$
modulo $\pi$. Hence we have

THEOREM III. A necessary and sufficient condition for the invariance of
a rational integral function $I(x_1, \dots, x_q)$ with integral coefficients under
the group $\Gamma$ of classes of transformations with determinant congruent to
unity, modulo $\pi = p_1^{\lambda_1} p_2^{\lambda_2} \cdots p_n^{\lambda_n}$, is that $I(x_1, \dots, x_q)$ be invariant under
every group $H_i$ of classes of transformations with determinant congruent to
unity, modulo $p_i^{\lambda_i}$ ($i = 1, \dots, n$).


Thus 


A—1 
(1) a) g\ Li 


| q J; 


iq. *i,g,a,b,j 


where $g_i$ is a rational integral function with integral coefficients. Since the
greatest common divisor of the $m_i$ ($i = 1, \dots, n$) is unity there exist
integers $k_i$ such that

$$\sum_{i=1}^{n} k_i m_i = 1,$$

and each $k_i$ is prime to the corresponding $p_i^{\lambda_i}$, as otherwise the left hand
member would be divisible by $p_i$.

Multiplying each of the equations (1) by the corresponding $k_i m_i$ and
adding we have

$$I(x_1, \dots, x_q) \equiv \sum_{i=1}^{n} k_i m_i\, g_i \pmod{\pi}.$$

As $k_i g_i$ is an invariant of $H_i$ (mod $p_i^{\lambda_i}$) and does not vanish modulo $p_i^{\lambda_i}$
unless $g_i$ vanishes modulo $p_i^{\lambda_i}$ we have finally the theorem stated in the
introduction.
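
The multipliers $k_i$ and the final combination can be illustrated with ordinary integers in place of the functions $g_i$. The sketch below assumes $\pi = 360 = 2^3 \cdot 3^2 \cdot 5$ and arbitrary sample residues; the function names are hypothetical. It finds integers $k_i$ with $\sum k_i m_i = 1$ by the extended Euclidean algorithm and checks that $\sum k_i m_i r_i$ reduces to $r_i$ modulo each prime power.

```python
def extended_gcd(a, b):
    """Return (g, u, v) with u*a + v*b == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, u, v = extended_gcd(b, a % b)
    return g, v, u - (a // b) * v

def bezout_coefficients(values):
    """Return (g, [k_i]) with sum(k_i * values[i]) == g == gcd of all the values."""
    g, coeffs = values[0], [1]
    for value in values[1:]:
        g, u, w = extended_gcd(g, value)
        coeffs = [u * c for c in coeffs] + [w]
    return g, coeffs

prime_powers = [8, 9, 5]                      # assumed example: pi = 360
modulus = 8 * 9 * 5
m = [modulus // q for q in prime_powers]      # m_i = pi / p_i^lambda_i
g, k = bezout_coefficients(m)

residues = [5, 7, 3]                          # sample values standing in for the g_i
combined = sum(ki * mi * ri for ki, mi, ri in zip(k, m, residues)) % modulus
print(g, k, combined)
```

Each product $k_i m_i$ is congruent to unity modulo $p_i^{\lambda_i}$ and to zero modulo the other prime powers, which is why the sum reproduces every residue at once.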


IOWA STATE COLLEGE,
Ames, Iowa. 




A UNIQUENESS THEOREM FOR THE LEGENDRE 
AND HERMITE POLYNOMIALS* 


BY 


K. P. WILLIAMS 


1. If we replace y in the expansion of $(1+y)^{-\nu}$ by $2xz+z^2$, the
coefficient of $z^n$ will, when x is replaced by $-x$, be the generalized
polynomial $L_n^{\nu}(x)$ of Legendre. It is also easy to show that the Hermitian
polynomial $H_n(x)$, usually defined by

$$\frac{d^n}{dx^n}\, e^{-x^2} = e^{-x^2} H_n(x),$$

is the coefficient of $z^n/n!$ in the series obtained on replacing y in the
expansion of $e^{-y}$ by the same expression $2xz+z^2$. Furthermore, there
is a simple recursion formula between three successive Legendre polynomials
and between three successive Hermitian polynomials. These facts suggest
the following problem.

Let
$$\varphi(y) = a_0 + a_1 y + a_2 y^2 + \cdots$$
and put
$$\varphi(2xz+z^2) = P_0 + P_1(x)\,z + P_2(x)\,z^2 + \cdots.$$

To what extent is the generating function $\varphi(y)$ determined if it is known
that a simple recursion relation exists between three of the successive
polynomials $P_0, P_1(x), P_2(x), \dots$? We shall find that the generalized
Legendre polynomials and those of Hermite possess a certain uniqueness
in this regard.

2. We have

$$P_n(x) = \frac{1}{n!}\left[\frac{d^n}{dz^n}\,\varphi(2xz+z^2)\right]_{z=0}.$$

When we make use of the formula for the nth derivative of a function
of a function given by Faà de Bruno,† we find without difficulty

$$P_n(x) = \sum \frac{(i+j)!}{i!\,j!}\, a_{i+j}\,(2x)^i,$$


* Presented to the Society, October 25, 1924.
+ Quarterly Journal of Mathematics, vol. 1, p. 359. 


where the summation extends to all values of i and j subject to the relation
$i + 2j = n$. Thus

$$P_n(x) = a_n (2x)^n + \frac{(n-1)!}{(n-2)!\,1!}\, a_{n-1} (2x)^{n-2} + \frac{(n-2)!}{(n-4)!\,2!}\, a_{n-2} (2x)^{n-4} + \cdots.$$


It is seen that while $P_n$ is an even or an odd function, the coefficients
of the generating function that enter into it form a certain consecutive
group, a fact which has important consequences.
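
The closed expression for $P_n(x)$ can be compared with a direct expansion of $\varphi(2xz+z^2)$. The following sketch assumes the reading $P_n(x) = \sum_{i+2j=n} \frac{(i+j)!}{i!\,j!} a_{i+j} (2x)^i$ and takes the Hermite case $a_k = (-1)^k/k!$ as sample data; the dictionary representation and the function names are choices made here, not the paper's.

```python
from fractions import Fraction
from math import comb, factorial

def multiply_truncated(f, g, max_z):
    """Multiply polynomials in (z, x), keyed by (z-degree, x-degree), dropping z-degree > max_z."""
    out = {}
    for (zf, xf), cf in f.items():
        for (zg, xg), cg in g.items():
            if zf + zg <= max_z:
                key = (zf + zg, xf + xg)
                out[key] = out.get(key, 0) + cf * cg
    return out

def p_direct(a, n):
    """Coefficient of z^n in sum a_k (2xz + z^2)^k, as {x-degree: coefficient}."""
    y = {(1, 1): Fraction(2), (2, 0): Fraction(1)}
    power, total = {(0, 0): Fraction(1)}, {}
    for k in range(n + 1):
        for key, c in power.items():
            total[key] = total.get(key, 0) + a[k] * c
        power = multiply_truncated(power, y, n)
    return {xd: c for (zd, xd), c in total.items() if zd == n and c != 0}

def p_formula(a, n):
    """Closed form: sum over j of C(n-j, j) a_{n-j} (2x)^(n-2j)."""
    terms = {n - 2 * j: comb(n - j, j) * a[n - j] * 2 ** (n - 2 * j)
             for j in range(n // 2 + 1)}
    return {xd: c for xd, c in terms.items() if c != 0}

hermite_a = [Fraction((-1) ** k, factorial(k)) for k in range(10)]  # expansion of exp(-y)
for n in range(4):
    print(n, p_formula(hermite_a, n))
```

Each term of degree $n-2j$ in x receives its coefficient from $a_{n-j}$ alone, which is the "consecutive group" of coefficients noted in the text.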

3. Let us denote by $A_n^m$ the term in $P_n(x)$ that is of degree m in x.


We see that 


the expressions being valid for / —1,0,1,2,--- if we agree that 
0. when m~n. The notable fact is that 42-7, both 


contain a,—;, but Ans’ contains a,—;-1. 


Let k and l be multipliers, which we shall assume to be polynomials
in n to be determined; then


2al Ange = [Int 


n+1 
a formula valid for = 0,1, 2,---. Let 


= In+(k— +k. 
We see that 
w(—1) (n--2)7, 


and when » is even, that 


/ 2) 
When developed, the expression is 
n—2j 
An 
| 
/ 
w | | 5 
| 
| 




This shows that, h being another polynomial in n to be determined,
1 


where $n' = n/2$, if n is even, and $n' = (n-1)/2$ if n is odd.
4. We see from the last expression what must be the character of the 
recursion relation.” and that for it to exist we must have 


-1 f(t) 


where g(x) is a polynomial in x. In order that the summation on the 
right vanish, it is necessary that 


= glu—y) a(n). 


6(n) being a polynomial in x. The polynomial (2) is then given at 
onee by 
A(n). 


It is easy to determine 7 and /, so that w(;) will have the desired 
form. Since g(2—,/) is of the same degree in / that @(m) is in n, and 
since w(/) is linear in /, we see that g(#) must be linear in x. 

Put 

(ir) 


Then 
In+(k—-209+k 
This is to be an identity in both » and j. Put / —1. and we find 
(n-+2)/7 


Since $\alpha$ and $\beta$ are arbitrary it follows that $\theta(n)$ must contain $n+2$ as
a factor, and

$$l = (\alpha n + \alpha + \beta)\,\frac{\theta(n)}{n+2}.$$


* It is evident that a linear recursion relation will not exist unless the factor $x$ is
introduced as in the middle term above.


It follows then at once that

$$k = (\alpha n + 2\beta)\,\frac{\theta(n)}{n+2}.$$


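No loss of generality results from putting $\theta(n) = n + 2$.
The polynomials will therefore have the recursion relation

$$(n+2)\,P_{n+2}(x) - 2(\alpha n + \alpha + \beta)\,x\,P_{n+1}(x) - (\alpha n + 2\beta)\,P_n(x) = 0,$$
if
$$(n+1)\,a_{n+1} = (\alpha n + \beta)\,a_n.$$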


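Taking $a_0 = 1$, we have for generating function

$$\varphi(y) = F\!\left(\frac{\beta}{\alpha}, a, a; \alpha y\right) = (1 - \alpha y)^{-\beta/\alpha}, \quad \text{if } \alpha \neq 0,$$

where F represents the hypergeometric function, and

$$\varphi(y) = e^{\beta y}, \quad \text{if } \alpha = 0.$$

These then are the only types of generating function that will give
a recursion relation, with the conditions that h, l, and k are polynomials
in n.*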
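
The recursion relation can be tested numerically for several choices of $\alpha$ and $\beta$. The sketch below assumes the reading $(n+1)a_{n+1} = (\alpha n + \beta)a_n$, $a_0 = 1$, together with the relation $(n+2)P_{n+2} - 2(\alpha n + \alpha + \beta)xP_{n+1} - (\alpha n + 2\beta)P_n = 0$; the function names are illustrative. The pairs $(0, -1)$ and $(-1, -1/2)$ correspond to $\varphi(y) = e^{-y}$ and $\varphi(y) = (1+y)^{-1/2}$.

```python
from fractions import Fraction
from math import comb

def coefficients(alpha, beta, count):
    """a_0 = 1 and (n + 1) a_{n+1} = (alpha n + beta) a_n (assumed reading)."""
    a = [Fraction(1)]
    for n in range(count - 1):
        a.append((alpha * n + beta) * a[n] / (n + 1))
    return a

def polynomial(a, n):
    """P_n as a list of coefficients by power of x: sum of C(n-j, j) a_{n-j} (2x)^(n-2j)."""
    coeffs = [Fraction(0)] * (n + 1)
    for j in range(n // 2 + 1):
        coeffs[n - 2 * j] = comb(n - j, j) * a[n - j] * 2 ** (n - 2 * j)
    return coeffs

def residual(alpha, beta, n, a):
    """Coefficients of (n+2) P_{n+2} - 2(alpha n + alpha + beta) x P_{n+1} - (alpha n + 2 beta) P_n."""
    res = [(n + 2) * c for c in polynomial(a, n + 2)]
    for degree, c in enumerate(polynomial(a, n + 1)):
        res[degree + 1] -= 2 * (alpha * n + alpha + beta) * c
    for degree, c in enumerate(polynomial(a, n)):
        res[degree] -= (alpha * n + 2 * beta) * c
    return res

cases = [(Fraction(0), Fraction(-1)),          # exp(-y): the Hermite case
         (Fraction(-1), Fraction(-1, 2)),      # (1 + y)^(-1/2): the Legendre case
         (Fraction(2), Fraction(3)),
         (Fraction(1, 3), Fraction(-5, 7))]
results = [all(c == 0 for n in range(8)
               for c in residual(al, be, n, coefficients(al, be, 12)))
           for al, be in cases]
print(results)
```

The check is exact because all arithmetic is carried out with rational numbers.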

5. A further remark might be made about the case $\alpha \neq 0$.

We have 


(2427+ 2") Pi (ar) + z+ P3(a:) 2? 


Also we find
$$\varphi'(y)\,(1 - \alpha y) = \beta\,\varphi(y),$$


and can then deduce
When this is combined with the recursion formula we have


* It would evidently be no more general to take h, l, k rational in n.




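a relation independent of $\alpha$ and $\beta$. From this and the recursion relation
we can obtain

$$(1+\alpha x^2)\,P_n''(x) + (\alpha + 2\beta)\,x\,P_n'(x) - n(\alpha n + 2\beta)\,P_n(x) = 0.$$

Now the differential equation

$$(1+\alpha x^2)\,\frac{d^2y}{dx^2} + (\alpha + 2\beta)\,x\,\frac{dy}{dx} - n(\alpha n + 2\beta)\,y = 0$$
is changed into
$$(1-\xi^2)\,\frac{d^2y}{d\xi^2} - (2\nu+1)\,\xi\,\frac{dy}{d\xi} + n(n+2\nu)\,y = 0$$

by putting $\xi = \sqrt{-\alpha}\cdot x$, $\nu = \beta/\alpha$. But this is the differential equation
satisfied by the generalized Legendre polynomials.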

6. It is evident that we can now state the following theorem:

Let
$$\varphi(y) = a_0 + a_1 y + a_2 y^2 + \cdots,$$
and put
$$\varphi(2xz+z^2) = P_0 + P_1(x)\,z + P_2(x)\,z^2 + \cdots.$$

The only cases in which there will be a recursion relation of the form
$$h(n)\,P_{n+2}(x) - 2\,l(n)\,x\,P_{n+1}(x) - k(n)\,P_n(x) = 0,$$

where h(n), l(n), and k(n) are polynomials, are essentially where we have
the generalized polynomials of Legendre, and the polynomials of Hermite.


INDIANA UNIVERSITY,
BLOOMINGTON, IND.


A NEW TYPE OF CLASS NUMBER RELATIONS*


BY 


E. T. BELL


1. Beyond the well known remark that the tenth of Kronecker's class
number relations is equivalent to the theorem of Gauss which gives the
number of representations of an integer as a sum of three squares, the
class number relations implicit in the representations of integers as sums
of an odd number of squares have not been developed. The remaining
relations obtainable from this source are of types different from any in the
literature, and are only remotely connected with elliptic functions. They
are of two kinds: relations involving in addition to the class number
functions either (1) only functions of the real divisors of integers, or (2)
functions of the complex divisors of integers or of the representations of
integers in quadratic forms containing more than three indeterminates, e. g.,
$\sum abcd$, the $\sum$ extending to all integers a, b, c, d the sum of whose
squares is a constant. The second kind is of but slight interest, as there
obviously is no difficulty in obtaining any number of such relations. But
those of the first kind are much less common and in fact appear to be
finite in number, being furnished by representations as sums of 5, 7, 9, 11,
13 or 15 squares, but not by representations as sums of 17, 19, 21, 23
or 25 squares, and probably by no higher odd number of squares.

The relations of the first kind fall into two distinct species. The first 
of these comprises equalities, for different q, r, s, t, u, Λ, between two or 
more sums of the form 


(A)    $\sum_a \xi(qn + r - sa)\,\Lambda(ta - u)$, 


where q, r, s, t, u are constant positive integers, $\xi(n)$ is a function of 
the real divisors alone of n for all integers n, Λ is one of E, F, $F_1$, G, and 
the summation extends to all a = 1, 2, 3, ... that make both $qn + r - sa$ 
and $ta - u \geq 0$. In a paper not yet published I have made a complete 
determination of this species. 

The second species, in many ways more interesting than the first, is as 
follows. In general when n is unrestricted no sum of type (A) in a 
relation of the first species is reducible to a function of only the real divisors 
of a single integer. But when n is of the form $cn^2$, where c is a constant 


* Presented to the Society, San Francisco Section, October 25, 1924. 


integer, the sums are in general so reducible. A relation of the second 
species is then one in which a sum of type (A) with n replaced by $cn^2$ 
is equal to a function of the real divisors only of a single integer. 

The present paper is limited to a short summary in § 2 of the method 
for finding relations of the second species, to the presentation in § 4 of 
the seven simplest specimens, and to the proofs of these in § 5. The 
entire set of relations of the second species can be derived by applying 
the method as outlined to the formulas of the paper already cited (to 
appear elsewhere) in conjunction with the results of the paper quoted in § 2. 

2. Let $N_r(n)$ denote the number of representations of n as a sum 
of r squares with integer roots $\gtrless 0$, and $N_r(n, s)$ the number of 
representations of n as a sum of r integer squares precisely s of which are odd 
with roots $> 0$ and even with roots $\gtrless 0$. Then $N_r(4n, 0) = N_r(n)$, 
$N_r(0, 0) = 1$, and by considering in all possible ways a sum of r squares 
as being composed of a sum of r−3 squares plus a sum of 3 squares, it 
is easily seen that 
4r(r—1) (7—2) 

s(s—1)(r—s) — 

s(s—1)(s—2) * 


N(n—1.r—3) = (r—1) (rv —2) DE(a—1) (n+3—4a, r—3). 


N; (n—1,8)= F(4a—2) 


Ne (n) = 12 E(n) +12 E(a—1) (n+ i—a). 


a 1 


in which, as always henceforth, the $\sum$ without indicated limits refers to 
all a = 1, 2, 3, ... that make the arguments of the summands $\geq 0$. As 
usual F(n) denotes the number of odd classes of binary quadratic forms 
of determinant −n, $E(n) = F(n) - F_1(n)$, where $F_1(n)$ is the number 
of even classes, and G(n) is the whole number of classes. The general 
formula of which the above five are special cases is discussed in the paper 
mentioned in § 1. There is no difficulty in proving the five independently 
from the well known theorems (7 


Nz; (8n—5) (8n—5,3) = 8F(8n—S), 
N; (4n—3) N; (4n—3, 1) 12 F'(4n—3). 
Nz (4n—2) = Nz (4n—2, 2) 12 F(4n—-2), 
Ns ( n--1) 1? E(n—1). 


For r = 3, 5, 7, 9, 11, 13, 15 these formulas give the relations of 
the first species in § 1. 

There is a corresponding set of formulas, not involving the class number 
functions, obtained from the consideration of a sum of r squares as a sum 
of r—1 squares plus a single square in all possible ways. The results 
derived from this set are given in an earlier paper*. 

When the same $N_r(n)$ or $N_r(n, s)$ is listed by both methods we equate 
the results and obtain the initial formula from which is derived a relation 
of the second species (§ 1). The procedure then is as follows. One 
member of this initial formula is (loc. cit., p. 170) a sum of the form 
$\sum f[(pn - qa^2)/g]$, in which p, q, g are numerical constants, or it is the 
sum of a small number (not more than 4) of such sums. If now it is possible 
to choose a constant c such that each sum of this form reduces, when n 
is replaced by $cn^2$, to the form $\sum f[\gamma(\alpha^2 n^2 - \beta^2 a^2)]$, we can apply the 
method of Hurwitz† and express each sum as a function of the real divisors 
alone of n. This process can be applied to the numerous formulas obtained 
as outlined above from $N_r(n)$, $N_r(n, s)$ for r = 3, 5, 7, 9, 11, 13, and 
this apparently exhausts its scope. 

3. To state the simplest examples of the second species we shall require 
$\xi(n)$ (the number of divisors of n of the form 4k−3 minus the number 
of divisors of the form 4k−1); $\zeta_r(n)$ (the sum of the rth powers of all 
the divisors of n). As always henceforth $n = 2^{\alpha} m$, $\alpha \geq 0$, m is odd 
and $m = \prod p^a = p^a q^b \cdots$ is the resolution of m (when m > 1) into 
powers of distinct primes. 
The theorems of Stieltjes and Hurwitz‡ can be restated in the form 


$N_3(n^2) = 6S(m)$,   $N_5(n^2) = 10\,\zeta_3(2^{\alpha})\,H(m)$, 

in which 


$S(m) = \prod \left[\zeta_1(p^a) - (-1)^{(p-1)/2}\,\zeta_1(p^{a-1})\right]$, 
$H(m) = \prod \left[\zeta_3(p^a) - p\,\zeta_3(p^{a-1})\right]$, 


the product extending to $p^a, q^b, \cdots$; $S(1) = H(1) = 1$. 
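
The two restated theorems can be checked by direct enumeration for small n. The sketch below is only an illustration: it assumes the standard forms $S(m) = \prod[\zeta_1(p^a) - (-1)^{(p-1)/2}\zeta_1(p^{a-1})]$ and $H(m) = \prod[\zeta_3(p^a) - p\,\zeta_3(p^{a-1})]$, and the helper names (`count_reps`, `sigma`, `odd_prime_powers`, `check`) are hypothetical, not taken from the paper.

```python
from itertools import product
from math import isqrt


def count_reps(n: int, r: int) -> int:
    """N_r(n): ordered representations of n as a sum of r integer squares."""
    bound = isqrt(n)
    return sum(
        1
        for roots in product(range(-bound, bound + 1), repeat=r)
        if sum(x * x for x in roots) == n
    )


def sigma(n: int, r: int) -> int:
    """zeta_r(n): the sum of the r-th powers of all divisors of n."""
    return sum(d**r for d in range(1, n + 1) if n % d == 0)


def odd_prime_powers(m: int) -> list:
    """Resolve an odd m into [(p, a), ...] by trial division."""
    out, p = [], 3
    while m > 1:
        a = 0
        while m % p == 0:
            m //= p
            a += 1
        if a:
            out.append((p, a))
        p += 2
    return out


def S(m: int) -> int:
    """Assumed form of S(m); the sign is (-1)^((p-1)/2)."""
    value = 1
    for p, a in odd_prime_powers(m):
        sign = 1 if p % 4 == 1 else -1
        value *= sigma(p**a, 1) - sign * sigma(p ** (a - 1), 1)
    return value


def H(m: int) -> int:
    """Assumed form of H(m)."""
    value = 1
    for p, a in odd_prime_powers(m):
        value *= sigma(p**a, 3) - p * sigma(p ** (a - 1), 3)
    return value


def check(n: int) -> bool:
    """Compare both theorems with brute-force counts for n = 2^alpha * m."""
    alpha, m = 0, n
    while m % 2 == 0:
        m //= 2
        alpha += 1
    three = count_reps(n * n, 3) == 6 * S(m)
    five = count_reps(n * n, 5) == 10 * sigma(2**alpha, 3) * H(m)
    return three and five


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, check(n))
```

The enumeration grows quickly with n, so this is meant only for very small cases.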


* American Journal of Mathematics, vol. 42 (1920), pp. 168-188. 

† Comptes Rendus, vol. 98 (1884), pp. 504-507. 

‡ Stieltjes, Comptes Rendus, vol. 98 (1884), pp. 663-4; Hurwitz, loc. cit. It does not 
seem to have been noticed that Hurwitz’ theorem on $N_5(n^2)$, L’Intermédiaire des 
Mathématiciens, vol. 14 (1907), p. 107, is equivalent to Stieltjes’ result of 1884; cf. Dickson, 
History of the Theory of Numbers, vol. 2, pp. ix, 271; vol. 3, pp. 134-5. 


For reductions of formulas we have 


&(n) = E(m), = +1, 


and 
$E(8n-5) = \tfrac{2}{3}F(8n-5)$,   $E(8n-1) = 0$,   $E(4n) = E(n)$, 

$E(4n-3) = F(4n-3)$,   $E(4n-2) = F(4n-2)$,   $F(4n) = 2F(n)$, 


to all of which we shall refer as the elementary reductions. Applying 
these to Stieltjes’ theorem, and recalling that $N_3(n) = 12E(n)$, we get 


$F(n^2) = 2^{\alpha-1}S(m)$,   $F_1(n^2) = \tfrac12(2^{\alpha}-1)S(m)$, 


$G(n^2) = \tfrac12(2^{\alpha+1}-1)S(m)$,   $E(n^2) = \tfrac12 S(m)$, 


for the last of which we shall have particular use later. 

From now on all formulas have been checked numerically. 

By the usual conventions a class equivalent to $a(x^2+y^2)$ is counted $\tfrac12$ 
in F, G, one equivalent to $a(2x^2+2xy+2y^2)$ counts for $\tfrac13$ in $F_1$, and 
$F(0) = 0$, $E(0) = -G(0) = \tfrac{1}{12}$. 

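4. The seven simplest relations of the second species are 


(I)    $168 \sum \xi(a)\,E(n^2 - a) = 5(8^{\alpha+1} - 1)H(m) - 21S(m)$, 
(II)   $8 \sum \xi(a)\,F(m^2 - 4a) = H(m) - S(m)$, 
(III)  $\sum \xi(8a-7)\,F(4m^2 + 7 - 8a) = H(m)$, 
(IV)   $\sum \xi(8a-3)\,F(16n^2 + 3 - 8a) = 8^{\alpha+1}H(m)$, 
(V)    $\sum \xi(4a-3)\,F(4m^2 + 6 - 8a) = H(m)$, 
(VI)   $\sum \xi(4a-3)\,F(16n^2 + 6 - 8a) = 8^{\alpha+1}H(m)$, 
(VII)  $12 \sum \xi(4a-3)\,E\!\left(\frac{m^2+3}{4} - a\right) = H(m)$, 

in all of which, by the notation explained, $n = 2^{\alpha} m$, $\alpha \geq 0$, m odd. 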
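
A numerical check of relation (I) is easy to sketch. The block below reads (I) as $168\sum \xi(a)E(n^2-a) = 5(8^{\alpha+1}-1)H(m) - 21S(m)$ and takes $E(n) = N_3(n)/12$ with $N_3(0) = 1$, so that $E(0) = 1/12$; both readings, the assumed product forms of S and H, and all function names are assumptions made for illustration.

```python
from fractions import Fraction
from math import isqrt


def r3(n: int) -> int:
    """N_3(n): ordered representations of n as a sum of three squares."""
    bound = isqrt(n)
    count = 0
    for x in range(-bound, bound + 1):
        for y in range(-bound, bound + 1):
            rest = n - x * x - y * y
            if rest < 0:
                continue
            z = isqrt(rest)
            if z * z == rest:
                count += 1 if z == 0 else 2  # z and -z
    return count


def E(n: int) -> Fraction:
    """E(n) taken as N_3(n)/12 (assumption stated in the lead-in)."""
    return Fraction(r3(n), 12)


def xi(n: int) -> int:
    """Divisors of the form 4k-3 minus divisors of the form 4k-1."""
    return sum(1 if d % 4 == 1 else -1 for d in range(1, n + 1, 2) if n % d == 0)


def sigma(n: int, r: int) -> int:
    return sum(d**r for d in range(1, n + 1) if n % d == 0)


def S_and_H(m: int):
    """Assumed product forms of S(m) and H(m) over the prime powers of odd m."""
    s_val, h_val, p = 1, 1, 3
    while m > 1:
        a = 0
        while m % p == 0:
            m //= p
            a += 1
        if a:
            sign = 1 if p % 4 == 1 else -1
            s_val *= sigma(p**a, 1) - sign * sigma(p ** (a - 1), 1)
            h_val *= sigma(p**a, 3) - p * sigma(p ** (a - 1), 3)
        p += 2
    return s_val, h_val


def lhs(n: int) -> Fraction:
    return 168 * sum(xi(a) * E(n * n - a) for a in range(1, n * n + 1))


def rhs(n: int) -> int:
    alpha, m = 0, n
    while m % 2 == 0:
        m //= 2
        alpha += 1
    s_val, h_val = S_and_H(m)
    return 5 * (8 ** (alpha + 1) - 1) * h_val - 21 * s_val


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, lhs(n), rhs(n))
```

Exact rational arithmetic is used so that the fractional value E(0) does not introduce rounding.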


Let us first note how these are related among themselves by means of 
the elementary reductions. Evidently the left of (I) can be written 


$168 \sum \left[\xi(a)\,E(n^2-4a) + \xi(4a-3)\{E(n^2+6-8a) + E(n^2+3-4a)\}\right]$. 


* There is a table for F(n), n = 1 to 100, in the Tohoku Mathematical Journal, 
vol. 9 (1921), p. 116. 




Hence it follows readily that (VII) is implied by (I) with $\alpha = 0$ and (II); 
(III) is implied by (I) with $\alpha = 1$ and (V); (IV) is implied by (I) with $\alpha = 2$ 
and (VI). 

5. Although it is sufficient to prove (I), (II), (V), (VI), it is as quick to 
prove (I)-(VI) together. From the formulas in § 2 with r = 5 we get 
the following, which can be easily verified: 


(1)  $N_5(n) = 12E(n) + 48\sum E(a-1)\,\xi(n+1-a)$, 
(2)  $N_5(8n-7) = 20F(8n-7) + 80\sum F(4a-3)\,\xi(2n-1-a)$, 
(3)  $N_5(4n) = 12E(n) + 16\sum\left[3E(a-1)\,\xi(n+1-a) + 5F(8a-5)\,\xi(4n+5-8a)\right]$, 
(4)  $N_5(8n-4) = 12E(2n-1) + 16\sum\left[3E(a-1)\,\xi(2n-a) + 5F(8a-6)\,\xi(4n+1-4a)\right]$, 
(5)  $N_5(8n) = 12E(2n) + 16\sum\left[3E(a-1)\,\xi(2n+1-a) + 5F(8a-2)\,\xi(4n+1-4a)\right]$. 


In (1) replace n by $n^2$, in (2) n by $\tfrac18(m^2+7)$, in (3) n by $n^2$, in (4) 
n by $\tfrac12(m^2+1)$, in (5) n by $2n^2$, and to the results thus obtained apply 
the theorems of Stieltjes and Hurwitz. By obvious combinations of the 
resulting formulas we get 


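(1.1)  $24\sum E(a-1)\,\xi(n^2+1-a) = 5\zeta_3(2^{\alpha})H(m) - 3S(m)$, 
(2.2)  $8\sum F(4a-3)\,\xi\!\left(\tfrac14(m^2+3) - a\right) = H(m) - S(m)$, 
(3.2)  $8\sum F(8a-5)\,\xi(4n^2+5-8a) = [7\zeta_3(2^{\alpha})+1]H(m)$, 
(4.2)  $\sum F(8a-6)\,\xi(2m^2+3-4a) = H(m)$, 
(5.2)  $8\sum F(8a-2)\,\xi(8n^2+1-4a) = [7\zeta_3(2^{\alpha+1})+1]H(m)$. 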


Reversal of the order of summation and reduction of the right hand 
members give at once (I)-(VI), the pair (III), (IV) coming from (3.2). 
UNIVERSITY OF WASHINGTON, 
SEATTLE, WASH. 


A NEW METHOD IN THE EQUIVALENCE OF PAIRS 
OF BILINEAR FORMS* 


BY 


R. G. D. RICHARDSON 


INTRODUCTION 

In attempting to clear the ground for an attack on the problem of re- 
lative maxima and minima of two quadratic or hermitian forms with an 
infinite number of variables, the author found that various questions in- 
sistently arose concerning the equivalence of pairs of forms in a finite number 
of variables and their reduction to normal type. The methods of attack 
in this theory as set forth in the literature seem to lack systematic unity. 
Not only are different devices employed for the treatment of the various 
phases of the problem, but these devices vary for different cases of the 
same phase. In investigating the problem of maxima and minima the author 
uncovered fundamental principles which may serve as a new interpretation 
of the problem of equivalence and make possible a new and more unified 
conception of the treatment. This new foundation for the theory of 
bilinear forms is set forth in this memoir while the problem of relative extrema 
is developed in the succeeding one.† 

To exhibit the formal connection between the problems of relative extrema 
and of equivalence of pairs of forms, we note that in considering the 
bilinear forms 


$A(x, y) = \sum_{i,j}^{1,n} a_{ij} x_i y_j$,   $B(x, y) = \sum_{i,j}^{1,n} b_{ij} x_i y_j$ 


and their equivalence under linear transformations, it is customary for 
convenience to study the pencil of forms 


(1)    $A(x, y) - \lambda B(x, y)$ 


with which is intimately connected the matrix $\|a_{ij} - \lambda b_{ij}\|$. This 
introduction of a parameter $\lambda$ is on the other hand a familiar device in the 
problem of relative extrema. For example, on setting up the problem 


* Presented to the Society, December 29, 1920. 

† Relative extrema of pairs of quadratic and hermitian forms, these Transactions, 
vol. 26, pp. 479-494. This will be referred to as II. 


(2)    $A(x, x) = \text{minimum}$,   $B(x, x) = 1$, 
we are led to a consideration of the minimum of 
(3)    $A(x, x) - \lambda B(x, x)$. 


In the consideration of either problem the linear homogeneous equations 
obtained by equating to zero the partial derivatives of (3) with respect 
to $x_i$ (or of (1) with respect to $y_i$), 


(4)    $(a_{i1} - \lambda b_{i1})x_1 + \cdots + (a_{in} - \lambda b_{in})x_n = 0$   (i = 1, ..., n), 

are of fundamental importance. In addition to these, the equations 

(4′)   $(a_{1j} - \lambda b_{1j})y_1 + \cdots + (a_{nj} - \lambda b_{nj})y_n = 0$   (j = 1, ..., n), 


obtained by differentiating (1) with respect to $x_j$, often come into 
consideration. The homogeneous equations (4) have a solution when and only 
when $\lambda$ is a characteristic number, that is, one of the n zeros $\lambda_1, \cdots, \lambda_n$ of 
the determinant $|a_{ij} - \lambda b_{ij}|$. On setting $\lambda$ equal to $\lambda_k$, denoting the 
corresponding solutions of (4) by 

$X^{(k)} = (x_1^{(k)}, \cdots, x_n^{(k)})$, 

multiplying the equations by $x_1^{(k)}, \cdots, x_n^{(k)}$ respectively and adding, there 
results the fundamental relation 


(5)    $A(X^{(k)}, X^{(k)}) - \lambda_k B(X^{(k)}, X^{(k)}) = 0$, 


which may be compared with (1) and (3). 

Any finite solution of the minimum problem (2) will be among the 
possible solutions of (4). At least in the case where the coefficients of $A(x, x)$ 
and $B(x, x)$ are real and the latter form is positive definite, a minimum 
exists and we note from (5) that the solution 


$X^{(1)} = (x_1^{(1)}, \cdots, x_n^{(1)})$ 


corresponding to the smallest characteristic number $\lambda_1$ and normalized to 
make $B(x, x) = 1$, furnishes to A the minimum value $\lambda_1$. 
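
The minimum property can be illustrated on a small case. The sketch below is an assumption-laden example, not the author's computation: the 2×2 real symmetric matrices A, B (with B positive definite) are invented for illustration, the smallest zero of $|a_{ij} - \lambda b_{ij}|$ is obtained from the quadratic formula, and the constrained minimum of A(x, x) on B(x, x) = 1 is estimated by scanning directions.

```python
from math import cos, pi, sin, sqrt

# Hypothetical example data: A symmetric, B symmetric positive definite.
A = [[2.0, 1.0], [1.0, 2.0]]
B = [[2.0, 0.0], [0.0, 1.0]]


def quadratic(M, x):
    """Value of the quadratic form M(x, x) for a 2-vector x."""
    return sum(M[i][j] * x[i] * x[j] for i in range(2) for j in range(2))


def smallest_characteristic_number(A, B):
    """Smaller zero of det(A - lam*B) = c2*lam^2 + c1*lam + c0 (c2 > 0 assumed)."""
    c2 = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    c1 = -(A[0][0] * B[1][1] + A[1][1] * B[0][0]
           - A[0][1] * B[1][0] - A[1][0] * B[0][1])
    c0 = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return (-c1 - sqrt(c1 * c1 - 4 * c2 * c0)) / (2 * c2)


def constrained_minimum(A, B, samples=20000):
    """Scan directions u, rescale so that B(x, x) = 1, keep the least A(x, x)."""
    best = float("inf")
    for s in range(samples):
        t = pi * s / samples
        u = (cos(t), sin(t))
        scale = 1.0 / sqrt(quadratic(B, u))
        x = (u[0] * scale, u[1] * scale)
        best = min(best, quadratic(A, x))
    return best


lam1 = smallest_characteristic_number(A, B)
minimum = constrained_minimum(A, B)

if __name__ == "__main__":
    print("smallest characteristic number:", lam1)
    print("scanned constrained minimum:  ", minimum)
```

The scan only approximates the minimum from above; it is a sanity check on the statement, not a method of solution.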

When B is definite, the other n−1 solutions of (4), $X^{(2)}, \cdots, X^{(n)}$, 
also have interpretations as minima, as we shall briefly sketch. Adding 
to the minimum problem (2) the orthogonal condition 


(6)    $B(X^{(1)}, x) = \sum_{i,j} b_{ij}\, x_i^{(1)} x_j = 0$, 


leads in formal fashion (II, § 3) to a consideration of the minimum of the 
expression 


$A(x, x) - \lambda B(x, x) - 2\mu B(X^{(1)}, x)$, 


which, on differentiation with regard to $x_i$, gives the non-homogeneous 
linear equations 


(7)    $\sum_j (a_{ij} - \lambda b_{ij})\,x_j = \mu\,[b_{i1}x_1^{(1)} + \cdots + b_{in}x_n^{(1)}]$ 


in the variables $x_1, x_2, \cdots, x_n$. It turns out in the case where B is definite 
(and even in a more general case) that $\mu$ is 0 and that the solution of 
the extended minimum problem is thus also a solution of (4), which on 
account of the relation (6) cannot be $X^{(1)}$ and hence from (5) must be 

$X^{(2)} = (x_1^{(2)}, \cdots, x_n^{(2)})$, 

corresponding to the second smallest characteristic number $\lambda_2$, and furnishing 
to $A(x, x)$ the minimum value $\lambda_2$. 
One may seek the minima under the successively added conditions 


(8)    $B(X^{(1)}, x) = 0, \cdots, B(X^{(n-1)}, x) = 0$, 


where $X^{(1)}, \cdots, X^{(n-1)}$ are the solutions of the successive problems. 
Formally this leads to a consideration of the minimum of 


$A(x, x) - \lambda B(x, x) - 2\mu_1 B(X^{(1)}, x) - \cdots - 2\mu_{n-1} B(X^{(n-1)}, x)$, 
and hence to the linear equations 


' 


and (6) and (8) together with the quadratic relation $B(x, x) = 1$. Here 
again it turns out that when B is definite (and even in a more general 
case), all the $\mu$’s are zero and the solution of the minimum problem is the 
solution $X^{(n)}$ of the equation (4), corresponding to the largest characteristic 
number $\lambda_n$, which on account of (5) gives $A(x, x)$ the value $\lambda_n$. 

On the other hand the problem of determining equivalence of pairs of 
forms is closely associated with that of finding linear transformations 


”(2) (2) 


of the 2n variables x and y into new variables x′, y′, such that the bilinear 
forms A and B in the new variables will have simple shapes. The 
coefficients of the new variables in the form $A(x, y)$ will be compounded 
from the a’s, ξ’s, η’s, the coefficient $a'_{ij}$ of $x'_i y'_j$ being 


as may readily be seen by actual substitution. In the same manner the 
bilinear form $B(x, y)$ is transformed by (10) into $\sum b'_{ij} x'_i y'_j$, where 


hy 


— Sh lk 
tj 


The transform of $A(x, y) - \lambda B(x, y)$ is of course 


y 


The matrix notation lends itself very readily to such linear transformations 
of bilinear forms, being compact and having a simple arithmetic. Let 


Nn 
X ae 8 
ein 2)... 
Sn 


and denote by |X the transposed of |X|, (rows and columns interchanged). 
Then the product | -| is 


(1) » (1) <1) » (1) (n) | 
> a, Ab, & q | 


(11) 


(1) »(n) —_ 2 (1) y(n)... > p(n) J) p(n) | 


the terms in the jth column and ith row of the matrices $\|a_{ij} - \lambda b_{ij}\|$ 
and (11) representing the coefficients of $x_i y_j$ and $x'_i y'_j$ respectively. 
The matrix (11) may be written in the form 


(11′)   $\| A(\xi^{(i)}, \eta^{(j)}) - \lambda B(\xi^{(i)}, \eta^{(j)}) \|$   (i, j = 1, ..., n). 


The problem of reduction of a pair of bilinear forms is then so to choose 
$\|X\|$, $\|Y\|$ that this matrix (11) (or (11′)) is of simplest possible shape. 

Since the coefficients of the new bilinear forms are values of $A(\xi, \eta)$, 
$B(\xi, \eta)$, the ξ, η must be chosen as sets of numbers such that the 
transformation (10) will reduce A and B to particularly simple forms; to zeros 
so far as possible, and the remaining ones to unity so far as possible. We 
shall find that the solutions of (4) and (4′) are particularly useful in this 
connection, since $A(\xi, \eta)$ and $B(\xi, \eta)$ are both zeros when ξ, η are solutions 
of (4) and (4′) corresponding to distinct values of $\lambda$. It turns out that in 
the regular case, which includes that when $B(x, x)$ is definite, there are n 
linearly independent solutions each of equations (4) and (4′), and that if the 
rows of the matrices X and Y are chosen as such solutions properly 
normalized and orthogonalized, the new bilinear forms take on the simplest of all 
shapes, 

$B'(x', y') = x'_1 y'_1 + \cdots + x'_n y'_n$,   $A'(x', y') = \lambda_1 x'_1 y'_1 + \cdots + \lambda_n x'_n y'_n$. 


We note further that the n solutions of (4) chosen for the coefficients of 
the transformations are the solutions of the corresponding minimum problems. 

The discussion (§ 5) of the solutions of (7) and (9), which are here forced 
on our attention, furnishes an interesting chapter in the solution of linear 
algebraic equations. The situation may be conceived of somewhat as 
follows.* When in considering the solutions of (4) it is found that there 
are multiple roots of the determinant $|a_{ij} - \lambda b_{ij}|$, it is possible to alter 
the elements so as to make all the roots distinct. On passing to the 
limit so that m of the $\lambda$’s become equal, it is fairly obvious that there 
are various possibilities. Solutions of the linear equations (4) corresponding 
to the infinitesimally different $\lambda$’s may all approach the same limiting 
solution, in which case there is only one linearly independent solution 
corresponding to this m-valued $\lambda$ (we shall say† $\lambda$-multiplicity = m, 
index = 1, solution-multiplicity = m); they may all approach different 
limiting solutions, in which case there are m linearly independent solutions 
($\lambda$-multiplicity = m; index = m, each solution-multiplicity = 1). 
Between these two extremes a variety of cases arise. If the number of 
linearly independent solutions of the x’s is p there will be $m_1$-, $m_2$-, ..., 
$m_p$-tuple solutions respectively ($\lambda$-multiplicity = m; index = p; 
solution-multiplicities $m_1, \cdots, m_p$; $\sum m_i = m$). These notions have been 
used by others‡ in proving some of the fundamental theorems of matrices. 
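
The distinction between λ-multiplicity and index can be made concrete by computing the rank of the λ-matrix at a multiple root. The sketch below is illustrative only: it takes B as the identity and three invented matrices A having λ = 2 as a triple root, and it computes the index as n minus the rank of A − λB. The function names are hypothetical.

```python
from fractions import Fraction


def rank(M):
    """Rank of a matrix by Gaussian elimination in exact arithmetic."""
    rows = [[Fraction(v) for v in row] for row in M]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def index(A, B, lam):
    """Number of linearly independent solutions of (A - lam*B) x = 0."""
    n = len(A)
    pencil = [[A[i][j] - lam * B[i][j] for j in range(n)] for i in range(n)]
    return n - rank(pencil)


IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# Three hypothetical cases, each with lambda = 2 of lambda-multiplicity 3.
ONE_SOLUTION = [[2, 1, 0], [0, 2, 1], [0, 0, 2]]     # index 1
TWO_SOLUTIONS = [[2, 1, 0], [0, 2, 0], [0, 0, 2]]    # index 2
THREE_SOLUTIONS = [[2, 0, 0], [0, 2, 0], [0, 0, 2]]  # index 3

if __name__ == "__main__":
    for name, A in [("one", ONE_SOLUTION), ("two", TWO_SOLUTIONS),
                    ("three", THREE_SOLUTIONS)]:
        print(name, index(A, IDENTITY, 2))
```

In the first case the single solution carries solution-multiplicity 3; in the last, each of the three solutions carries solution-multiplicity 1.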


* Cf. examples in footnote § 5. 

† Sylvester used the terms nullity and vacuity; the terms here chosen are more in line 
with those used in the theories of differential and integral equations. 

‡ For example, Taber, On the theory of matrices, American Journal of 
Mathematics, vol. 12, p. 3. 


We have here gone further and set up sets of related equations, called 
derived equations and similar to (7) and (9), which give solutions to take 
the place of those lost by the coalescing of distinct ones. The theory 
of derived equations enables us to compute each solution of (4) and its 
multiplicity m;. No details are here presented except those which are 
pertinent to the problem in hand. The relations of the theories of derived 
equations and of elementary divisors would furnish material for another 
memoir. 

In the irregular case where the solution-multiplicities are not all unity, 
the problem of reduction of a pair of bilinear forms is thus more com- 
plex than in the regular case. There will be solutions of (4) to fill 
up s only of the rows of the x and y matrices, but as indicated above 
there exist n—s solutions of derived equations also and these are used 
to fill up the gaps. On multiplication by matrices thus formed, (11) 
takes on the normal types which are well known in the theory of bilinear 
forms (§ 6). 

The procedure in the reduction of pairs of bilinear forms is thus made 
direct and a simple method of calculation made possible. By this point 
of view not only is the need made apparent for distinguishing between 
the cases where the normal types are different, but many related questions 
are answered. On the other hand, the reductions as worked out by 
Weierstrass, Kronecker, Darboux and others are not only complex in theory 
and difficult in technique but they involve the introduction of methods 
which cannot be classed strictly as algebraic. 

The criterion for equivalence of bilinear forms developed in § 6 must 
naturally be equivalent to that ordinarily expressed in terms of elementary 
divisors. The new criterion, however, appears essentially simpler in its 
fundamental conception. It depends only on the notions of the solutions 
of linear homogeneous and non-homogeneous equations. The necessary 
and sufficient condition that two pairs of bilinear forms be equivalent is 
that the $\lambda$-polynomials be equivalent and that for any root $\lambda$ each solution- 
multiplicity for one set of equations be matched with an equal one for 
the other. 

As an example of the straightforward treatment possible under this new 
theory, an application is made in §7 to the reduction to normal type of 
a single quadratic form, the transformation used being orthogonal. 

While this memoir concerns itself only with those families of bilinear 
forms which are not identically singular in $\lambda$, the methods and results 
may be extended to singular families. In that case the equations (4) or (5) 
or both will have solutions which involve $\lambda$ identically. In order 
completely to characterize the type, certain constants which occur in these 
solutions must be added to the set of invariants made up, as in the 
non-singular case, of the $\lambda$-multiplicities. Again in this singular case the methods 
of reduction to a normal type consist solely of the straightforward solution 
of linear equations. 

After building up in §§ 2-3 the theory of the linear equations associated 
with a pair of bilinear forms for the regular case and in § 5 extending 
it to the derived equations for the irregular case, application is made in 
§§ 4, 6 to the problem of reduction to normal type of pairs of bilinear 
forms and in § 7 to the special cases of hermitian and quadratic forms. 


1. A PAIR OF MATRICES WITH THEIR ASSOCIATED BILINEAR FORMS AND SETS 
OF LINEAR EQUATIONS 


This paper deals with two bilinear forms 


$A(x, y) = \sum_{i,j}^{1,n} a_{ij} x_i y_j$,   $B(x, y) = \sum_{i,j}^{1,n} b_{ij} x_i y_j$. 


The two sets each of $n^2$ complex numbers $a_{ij}$, $b_{ij}$, which occur in the 
coefficients of these forms may be conceived of as a pair of matrices $\|A\|$, $\|B\|$. 

Let it be assumed that $\|B\|$ is non-singular;* in other words, that the 
determinant $|b_{ij}|$ is different from zero. The two sets each of n 
homogeneous equations 


(13)    $\sum_j (a_{ij} - \lambda b_{ij})\,x_j = 0$, 
(14)    $\sum_i (a_{ij} - \lambda b_{ij})\,y_i = 0$, 


having as coefficients respectively the rows and columns of the $\lambda$-matrix 


$\|A - \lambda B\| = \begin{Vmatrix} a_{11} - \lambda b_{11} & \cdots & a_{1n} - \lambda b_{1n} \\ \cdots & \cdots & \cdots \\ a_{n1} - \lambda b_{n1} & \cdots & a_{nn} - \lambda b_{nn} \end{Vmatrix}$, 


are of fundamental importance. In order that either set have a solution, 
it is necessary and sufficient that $\lambda$ be a characteristic number; that is, 
a zero of the determinant $|a_{ij} - \lambda b_{ij}|$ formed from the coefficients. Since 
the determinants $|a_{ij}|$, $|b_{ij}|$ are respectively the constant term and the 
coefficient of $\lambda^n$ in the expansion of this determinant as a polynomial 
in $\lambda$, there are n zeros $\lambda_1, \cdots, \lambda_n$ of the determinant, and if $\|A\|$ is 
non-singular, none of these vanish. Some or all of the $\lambda$-zeros may be equal. 


* The case where the matrix $\|A - \lambda B\|$ is singular, that is where the determinant 
vanishes identically in $\lambda$, is excluded from the discussion. It is shown in the texts that 
for the non-singular case it is always possible by a simple transformation to take for 
a basis a pair of matrices one of which is non-singular; hence we lose nothing here in 
generality by assuming that $\|B\|$ is non-singular. 

For a simple zero $\lambda_k$, the determinant $|a_{ij} - \lambda_k b_{ij}|$ will be of rank n−1 
and hence there will correspond one linearly independent solution 


$X^{(k)} = (x_1^{(k)}, \cdots, x_n^{(k)})$ 


of the equations (13) (and one 


$Y^{(k)} = (y_1^{(k)}, \cdots, y_n^{(k)})$ 


of the equations (14)). For a q-fold zero $\lambda_k$ the determinant will have 
rank n−r, where r may have any value from 1 to q; and there will 
correspond r linearly independent solutions of (13). The linearly independent 
solutions of (13) may be designated primary solutions. The simplest case, 
where all the $\lambda$-roots are distinct, will be treated in § 2. The next 
simplest case, where some of the $\lambda$-roots are multiple but where to such 
a root there correspond as many linearly independent solutions X as the 
multiplicity (r = q), is treated in § 3. In both these cases the total 
number of primary solutions is n. The most difficult case, where the 
number of linearly independent solutions corresponding to any multiple 
$\lambda$-root is less than the multiplicity, is treated in § 5. 


2. SETS OF LINEAR EQUATIONS. FIRST REGULAR CASE 


In this section we shall discuss the simplest problem connected with 
the sets of linear equations (13) and (14), viz. when all the zeros of the 
$\lambda$-determinant are distinct. The n primary solutions of (13) and (14) may 
be designated respectively as follows: 


(16)    $X^{(k)} = (x_1^{(k)}, \cdots, x_n^{(k)})$   (k = 1, ..., n), 


(16′)   $Y^{(k)} = (y_1^{(k)}, \cdots, y_n^{(k)})$   (k = 1, ..., n). 


If we consider the characteristic number $\lambda_k$ and multiply equations (13) 
by any set of numbers $Y = (y_1, \cdots, y_n)$ respectively and add, we get 


(17)    $A(X^{(k)}, Y) = \lambda_k B(X^{(k)}, Y)$. 


On the other hand by considering the characteristic number $\lambda_{k'}$ and 
multiplying the equations (14) by any set of numbers $X = (x_1, \cdots, x_n)$ 
respectively and adding, we obtain 

(18)    $A(X, Y^{(k')}) = \lambda_{k'} B(X, Y^{(k')})$. 


When we use for multipliers Y and X some solutions of (14) and (13) 
respectively, the relations (17) and (18) reduce to 


(19)    $A(X^{(k)}, Y^{(k')}) = \lambda_k B(X^{(k)}, Y^{(k')})$,   $A(X^{(k)}, Y^{(k')}) = \lambda_{k'} B(X^{(k)}, Y^{(k')})$, 

which, when no ambiguity can result, may be written in the form 

(19′)   $A(k, k') = \lambda_k B(k, k')$,   $A(k, k') = \lambda_{k'} B(k, k')$. 


When the characteristic numbers are different ($\lambda_k \neq \lambda_{k'}$) there follow 
from (19) the two fundamental relations 


(20)    $A(k, k') = 0$,   $B(k, k') = 0$, 


which will be designated as the orthogonal relations for $X^{(k)}$ and $Y^{(k')}$ with 
regard to the matrices $\|a_{ij}\|$ and $\|b_{ij}\|$ respectively. Hence follows 

THEOREM I. The solutions $X^{(k)}$ and $Y^{(k')}$ are orthogonal provided they 
correspond to unequal characteristic numbers. 
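
Theorem I can be tried out on a small non-symmetric pair. The following sketch is an illustration under assumptions: the 2×2 matrices are invented so that the characteristic numbers are 1 and 3, the bilinear value is taken as $\sum_{i,j} m_{ij}\,y_i x_j$ (the index placement that makes the derivation of (17) and (18) go through), and the helper names are hypothetical.

```python
from math import sqrt

# Hypothetical example pair with distinct real characteristic numbers.
A = [[3.0, 3.0], [1.0, 2.0]]
B = [[1.0, 1.0], [0.0, 1.0]]


def characteristic_numbers(A, B):
    """Zeros of det(A - lam*B) = c2*lam^2 + c1*lam + c0 for 2x2 matrices."""
    c2 = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    c1 = -(A[0][0] * B[1][1] + A[1][1] * B[0][0]
           - A[0][1] * B[1][0] - A[1][0] * B[0][1])
    c0 = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    disc = sqrt(c1 * c1 - 4 * c2 * c0)
    return (-c1 - disc) / (2 * c2), (-c1 + disc) / (2 * c2)


def pencil(A, B, lam):
    return [[A[i][j] - lam * B[i][j] for j in range(2)] for i in range(2)]


def right_solution(M):
    """x with sum_j M[i][j]*x[j] = 0 for a singular 2x2 M, as in (13)."""
    (p, q), (r, s) = M
    return (q, -p) if abs(p) + abs(q) > 1e-12 else (s, -r)


def left_solution(M):
    """y with sum_i M[i][j]*y[i] = 0 for a singular 2x2 M, as in (14)."""
    (p, q), (r, s) = M
    return (r, -p) if abs(p) + abs(r) > 1e-12 else (s, -q)


def form(M, x, y):
    """sum_{i,j} M[i][j]*y[i]*x[j]; this index placement is an assumption."""
    return sum(M[i][j] * y[i] * x[j] for i in range(2) for j in range(2))


lams = characteristic_numbers(A, B)
X = [right_solution(pencil(A, B, lam)) for lam in lams]
Y = [left_solution(pencil(A, B, lam)) for lam in lams]

if __name__ == "__main__":
    for k in range(2):
        for kp in range(2):
            print(k + 1, kp + 1, form(A, X[k], Y[kp]), form(B, X[k], Y[kp]))
```

For k ≠ k′ both printed values vanish, while for k = k′ the value of B is different from zero, which is what allows the normalization discussed next.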

As will be shown in Theorem IV for a more general case, the solutions (16) 
(and also (16’)) are linearly independent. 

Given the solutions $X^{(k)}$, $Y^{(k)}$ corresponding to the same characteristic 
number $\lambda_k$, we shall say that $X^{(k)}$, $Y^{(k)}$ are normalized provided that one 
or both solutions are multiplied by constants such as will make $B(k, k)$ 
equal to unity. 

THEOREM II. Any solutions $X^{(k)}$, $Y^{(k)}$ which correspond to a simple 
characteristic number can be normalized. 

To prove this we note from Theorem I that when $k \neq k'$, $B(k, k') = 0$. 
Were $B(k, k)$ also equal to zero, it would follow as in the Introduction 
(Formula 11) that the product of three matrices $\|Y\|$, $\|B\|$, $\|X\|$ of rank n 
would be a matrix with every element in the kth row zero; that is, three 
non-vanishing determinants would have a vanishing product. Hence $B(k, k)$ 
must be different from zero, and on multiplying $X^{(k)}$ or $Y^{(k)}$, or both, by 
proper constants it may be made equal to unity, and these solutions will 
from now on be assumed normalized. 

Since on setting k′ = k in (19′) we have 


(21)    $A(k, k) = \lambda_k B(k, k)$, 


it follows that when the solutions are normalized, $A(k, k)$ has the value $\lambda_k$. 


Incidentally the following theorem has been established: 

THEOREM III. For solutions (16) and (16′), it is impossible that $B(k, k')$ 
vanish for all values of k′ when k is fixed or for all values of k when k′ 
is fixed. 

The advantages of using the solutions (16) and (16′) as the coefficients 
of the linear transformations to be used in the reduction of the bilinear 
forms are at once evident if we apply Theorems I and II to the matrix (11). 
We note that the orthogonal properties make $A(k, k')$, $B(k, k')$ both zero 
when $k \neq k'$, and that on account of normalization the remaining terms (in 
the main diagonal) may be written $\lambda_k - \lambda$. This is stated in the form of 
a theorem in § 4. 
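
The effect of normalization on the transformed λ-matrix can be sketched in exact arithmetic. In the block below the matrices and the solutions are invented for illustration (characteristic numbers 1 and 3), the bilinear value is again taken as $\sum_{i,j} m_{ij}\,y_i x_j$ by assumption, and the code first confirms that the supplied vectors do solve (13) and (14) before normalizing.

```python
from fractions import Fraction as Fr

# Hypothetical example pair and its solutions (values chosen for illustration).
A = [[3, 3], [1, 2]]
B = [[1, 1], [0, 1]]
lams = [Fr(1), Fr(3)]
X = [(Fr(2), Fr(-2)), (Fr(-1), Fr(-1))]   # solutions of (13)
Y = [(Fr(1), Fr(-2)), (Fr(1), Fr(0))]     # solutions of (14)


def form(M, x, y):
    """sum_{i,j} M[i][j]*y[i]*x[j]; this index placement is an assumption."""
    return sum(M[i][j] * y[i] * x[j] for i in range(2) for j in range(2))


def solves_13(x, lam):
    return all(sum((A[i][j] - lam * B[i][j]) * x[j] for j in range(2)) == 0
               for i in range(2))


def solves_14(y, lam):
    return all(sum((A[i][j] - lam * B[i][j]) * y[i] for i in range(2)) == 0
               for j in range(2))


# Normalize: divide X^(k) by B(k, k) so that B(k, k) becomes unity.
X_norm = [tuple(v / form(B, X[k], Y[k]) for v in X[k]) for k in range(2)]


def reduced(lam):
    """Entries A(k, k') - lam*B(k, k') built from the normalized solutions."""
    return [[form(A, X_norm[k], Y[kp]) - lam * form(B, X_norm[k], Y[kp])
             for k in range(2)] for kp in range(2)]


if __name__ == "__main__":
    print(all(solves_13(X[k], lams[k]) and solves_14(Y[k], lams[k])
              for k in range(2)))
    for lam in (Fr(0), Fr(1, 2), Fr(5)):
        print(lam, reduced(lam))
```

Only the X’s are rescaled here; rescaling the Y’s, or both sets, would serve equally well.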


3. SETS OF LINEAR EQUATIONS. SECOND REGULAR CASE 

When $\lambda$ is a multiple root of the determinant $|a_{ij} - \lambda b_{ij}|$, the number 
of corresponding linearly independent solutions (16) of (13) may be equal 
to or less than the multiplicity. The former case is much simpler and will 
be treated in this section. If by a slight change of one or more of the 
coefficients $a_{ij}$, $b_{ij}$ the multiple root of the determinant is broken up into 
distinct ones slightly separated, the solutions X, Y of (13) and (14) will be 
only slightly altered. These altered solutions are subject to the orthogonal 
and normal properties of Theorems I and II, not only with other solutions 
but among themselves. It is natural to suppose, on passing back to the 
multiple $\lambda$, that when these solutions X, Y have distinct limits the limits 
will satisfy the same conditions. That this is true and that this second 
case is thus intimately related to the first, we shall now proceed to 
demonstrate. 

Concerning the solutions of the equations (13), which may be written 
in the form 


it is possible to prove the following 

THEOREM IV. The n solutions (16) of the equation (13) are linearly 
independent provided that those corresponding to a multiple characteristic 
number $\lambda$ are chosen so as to be linearly independent among themselves. 


For if not, let us consider that the matrix of the x’s in (16) be of 
rank n−m. Then there exist constants $c_1, \cdots, c_n$ such that 


(22)    $c_1 x_j^{(1)} + \cdots + c_n x_j^{(n)} = 0$   (j = 1, ..., n), 


where m of the c’s may be assigned at random (except that not all are 
zero). On the other hand let us pick out any one of the equations (13) 
(say the ith), give to $\lambda$ each of the characteristic values in succession, 
and consider the corresponding solutions. If the equations are multiplied 
respectively by $c_1, c_2, \cdots, c_n$ and added, the sum of each column on the 
left vanishes, and we have the resulting equation 

(23)    $\sum_j b_{ij}\,[\lambda_1 c_1 x_j^{(1)} + \cdots + \lambda_n c_n x_j^{(n)}] = 0$. 

Giving to i its values, these are n equations with coefficients $b_{ij}$, the 
determinant of which is by hypothesis different from zero. It follows then 
that the corresponding variables are zero: 


(24)    $\lambda_1 c_1 x_j^{(1)} + \cdots + \lambda_n c_n x_j^{(n)} = 0$   (j = 1, ..., n). 


Since the rank of the matrix of the x’s is assumed to be n−m, the 
values of m of the $\lambda_i c_i$ may be assigned at random (except that not all 
are zero), and the values of the remaining n−m can then be expressed 
linearly in terms of those assigned, the coefficients being formed from certain 
minors of the x’s. But by hypothesis the c’s satisfy the equations (22) 
and by the same processes n−m of these can be expressed linearly in 
terms of m assigned arbitrarily. The coefficients of these linear relations 
are compounded from the x’s in the same way as they are in the case of 
the linear relations obtained from (23). Hence the two sets of coefficients 
are equal, and we have 


4 (i) ' @ 5 
hii = kn—m An—m-:-1 (n—m4-1 + kn Antn (4 = 1,---.m—m), 


a= 
where $c_{n-m+1}, \cdots, c_n$ are arbitrary: for example, they may all be zero 
except one, and this one may be successively $c_{n-m+1}, \cdots, c_n$. But this is 
only possible when $\lambda_1 = \lambda_2 = \cdots = \lambda_n$, that is when $\lambda$ is an n-tuple root. 
But even this case is ruled out, for we assumed that those of the solutions (16) 
corresponding to the same characteristic number were chosen linearly 
independent. Hence a contradiction is established and the solutions are 
linearly independent. 

It is possible to proceed further and to show that the solutions corre- 
sponding to the multiple characteristic numbers may be so chosen that 
the n solutions (16) and (16’) are normalized and mutually orthogonal. 

As a typical case, let us assume that there are three characteristic 
numbers $\lambda_1$, all equal, and no others, and show how to set up three 
normalized linearly independent solutions having among themselves the 
orthogonal property. In the proof the linearly independent $Y^{(1)}$, $Y^{(2)}$, $Y^{(3)}$ 
will be taken at random and the X’s adapted to fulfil the conditions, but 
other methods may be used (cf. § 7). Denoting by $X^{(k_1)}$, $X^{(k_2)}$, $X^{(k_3)}$ three 
linearly independent solutions corresponding to the value of $\lambda$ under 
consideration, we must first show that there exist constants $\alpha_1$, $\alpha_2$, $\alpha_3$ such 
that the solution $X^{(1)} = \alpha_1 X^{(k_1)} + \alpha_2 X^{(k_2)} + \alpha_3 X^{(k_3)}$ satisfies the relations 
$B(1, 1) = 1$, $B(1, 2) = 0$, $B(1, 3) = 0$. The bilinear relation is linear in 
each of the variables and these equations may be written 


(25)    $\alpha_1 B(k_1, 1) + \alpha_2 B(k_2, 1) + \alpha_3 B(k_3, 1) = 1$, 
        $\alpha_1 B(k_1, 2) + \alpha_2 B(k_2, 2) + \alpha_3 B(k_3, 2) = 0$, 
        $\alpha_1 B(k_1, 3) + \alpha_2 B(k_2, 3) + \alpha_3 B(k_3, 3) = 0$. 


The determinant of the coefficients B is the product of three determinants, 
as we note from a formula analogous to (11). Since by hypothesis all 
three of these determinants $|X|$, $|Y|$ and $|b_{ij}|$ are non-singular, the 
determinant of the B’s is different from zero and the constants $\alpha_1$, $\alpha_2$, $\alpha_3$ may 
be determined to satisfy (25). 
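
The determination of the constants from (25) amounts to solving a small linear system, which can be sketched as follows. Everything concrete here is an assumption made for illustration: the matrix B, the three starting solutions and the three Y’s are invented, the case A = λ₁B is implied (so that every vector solves (13) and (14) for the triple root), and the bilinear value is taken as $\sum_{i,j} b_{ij}\,y_i x_j$.

```python
from fractions import Fraction

# Hypothetical data: a non-singular B, three independent starting solutions
# X^(k1), X^(k2), X^(k3), and three independent Y's taken at random.
B = [[2, 1, 0], [1, 3, 1], [0, 1, 1]]
X_START = [(1, 0, 0), (1, 1, 0), (1, 1, 1)]
Y = [(1, 0, 1), (0, 1, 0), (0, 1, 1)]


def form(M, x, y):
    """sum_{i,j} M[i][j]*y[i]*x[j]; this index placement is an assumption."""
    n = len(M)
    return sum(M[i][j] * y[i] * x[j] for i in range(n) for j in range(n))


def solve(M, rhs):
    """Gauss-Jordan elimination in exact arithmetic for a non-singular M."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(M, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]


# Coefficient matrix of (25): row l, column i holds B(k_i, l).
G = [[form(B, X_START[i], Y[l]) for i in range(3)] for l in range(3)]


def adapted_solution(target):
    """Combination of the starting solutions with B(target, l) = delta."""
    rhs = [1 if l == target else 0 for l in range(3)]
    coeffs = solve(G, rhs)
    return tuple(sum(c * X_START[i][j] for i, c in enumerate(coeffs))
                 for j in range(3))


X = [adapted_solution(s) for s in range(3)]
gram = [[form(B, X[s], Y[l]) for l in range(3)] for s in range(3)]

if __name__ == "__main__":
    for row in gram:
        print(row)
```

The same system matrix serves for all three adapted solutions; only the right-hand side changes, which mirrors the way the text passes from the α’s to the β’s and γ’s.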

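In similar fashion constants $\beta_1$, $\beta_2$, $\beta_3$ may be determined such that 


$X^{(2)} = \beta_1 X^{(k_1)} + \beta_2 X^{(k_2)} + \beta_3 X^{(k_3)}$ 


satisfies the relations $B(2, 1) = 0$, $B(2, 2) = 1$, $B(2, 3) = 0$, and 
constants $\gamma_1$, $\gamma_2$, $\gamma_3$ such that 

$X^{(3)} = \gamma_1 X^{(k_1)} + \gamma_2 X^{(k_2)} + \gamma_3 X^{(k_3)}$ 

satisfies the relations $B(3, 1) = 0$, $B(3, 2) = 0$, $B(3, 3) = 1$. That 
$X^{(1)}$, $X^{(2)}$, $X^{(3)}$ are linearly independent is easily shown. 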

The method of treatment used in this special case is of general application.
If besides $\lambda_1$, for which the index and solution-multiplicity are assumed to
be q, there are other characteristic numbers $\lambda_2, \lambda_3, \dots$ the number of
equations in the q variables corresponding to (25) will be q, the other
similar relations being satisfied identically, as may be seen from Theorem II.
The determinant of the B's cannot be zero, for it is a minor of an n-rowed
determinant which is the product of three determinants whose values are
different from zero; and further all terms of this n-rowed determinant in
the first p rows and p columns except this minor composed of the B's are
zero. This completes the proof of the following

THEOREM V. When each solution-multiplicity of the solutions of (13) is
unity, it is possible to choose the linearly independent solutions (16) and (16')
so that they are normalized with respect to the matrix $\|b_{ij}\|$ and mutually
orthogonal with respect to both $\|a_{ij}\|$ and $\|b_{ij}\|$.

We may note further that Theorem III is valid for this case also.


4. SIMULTANEOUS REDUCTION OF TWO MATRICES.
FIRST AND SECOND REGULAR CASES

We are now in a position to discuss the simplest case of the simultaneous
reduction of two matrices $\|A\|$ and $\|B\|$ to normal forms. Since the first
and second cases treated in §§ 2, 3 are essentially similar they can here
be handled together. The number of primary solutions each of (13) and (14)
is in both cases n.

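In reducing simultaneously the matrices $\|A\|$, $\|B\|$ the $\lambda$-matrix $\|A - \lambda B\|$
compounded from them may be considered as playing a fundamental role.
Denoting the transposed matrix of the sets of solutions (16) by $\|X\|$ and
the matrix of the sets (16') by $\|Y\|$, and effecting the matrix multiplication

       $\|Y\| \, \|A - \lambda B\| \, \|X\|,$

we have by Theorems I, II and V and formula (11) this simplest of all
normal forms:

       $\begin{Vmatrix} \lambda_1 - \lambda & 0 & \cdots & 0 \\ 0 & \lambda_2 - \lambda & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n - \lambda \end{Vmatrix}.$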
As was noted in the Introduction, this is equivalent to the following

THEOREM VI. By means of linear transformations of the x's and y's
whose coefficients are the columns of (16) and (16') which are respectively
the solutions of the equations (13) and (14) associated with the bilinear
forms A(x, y), B(x, y), these forms are reduced to

       $B(x, y) = x_1 y_1 + \cdots + x_n y_n,$

       $A(x, y) = \lambda_1 x_1 y_1 + \cdots + \lambda_n x_n y_n.$
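The reduction stated in Theorem VI can be traced on a small case. The sketch below is illustrative only: the matrices `a` and `b` are hypothetical, their characteristic numbers 1 and 2 were found by hand from $|a - \lambda b| = (1-\lambda)(2-\lambda)$, the helper names are not from the text, and the convention B(X, Y) = sum of b_ij y_i x_j is assumed.

```python
from fractions import Fraction

def nullspace(m):
    """Return a basis of the solutions z of m z = 0, using exact row reduction."""
    rows = [[Fraction(v) for v in row] for row in m]
    ncols = len(rows[0])
    pivots, r = [], 0
    for c in range(ncols):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        pv = rows[r][c]
        rows[r] = [v / pv for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [v - f * w for v, w in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    basis = []
    for free in [c for c in range(ncols) if c not in pivots]:
        vec = [Fraction(0)] * ncols
        vec[free] = Fraction(1)
        for row_idx, pc in enumerate(pivots):
            vec[pc] = -rows[row_idx][free]
        basis.append(vec)
    return basis

def bilinear(c, x, y):
    """Value sum over i, j of c[i][j] * y[i] * x[j] (assumed index convention)."""
    n = len(c)
    return sum(c[i][j] * y[i] * x[j] for i in range(n) for j in range(n))

a = [[1, 3], [0, 2]]   # hypothetical coefficients a_ij
b = [[1, 1], [0, 1]]   # hypothetical coefficients b_ij
roots = [1, 2]         # characteristic numbers, computed by hand

xs, ys = [], []
for lam in roots:
    pencil = [[a[i][j] - lam * b[i][j] for j in range(2)] for i in range(2)]
    x = nullspace(pencil)[0]                             # solves equations of type (13)
    y = nullspace([list(col) for col in zip(*pencil)])[0]  # solves equations of type (14)
    scale = bilinear(b, x, y)                            # nonzero for a simple root
    xs.append([v / scale for v in x])                    # normalize so that B(i, i) = 1
    ys.append(y)

reduced_b = [[bilinear(b, x, y) for x in xs] for y in ys]
reduced_a = [[bilinear(a, x, y) for x in xs] for y in ys]
```

For these data `reduced_b` is the unit matrix and `reduced_a` is diagonal with the characteristic numbers 1 and 2 on the diagonal, which corresponds to the two sums in the theorem.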


5. SETS OF LINEAR DERIVED EQUATIONS. IRREGULAR CASE 


The simple cases treated in the previous sections may be considered as 
regular. There are irregular cases branching off at several points in the 
treatment. Before undertaking a study of these irregular cases let us 
discuss a possible conception of their relation to the regular case. 




In the irregular case there will be for some $\lambda$ a solution X of the
equations (13) which may be regarded as multiple. By a change of one
of the coefficients $a_{ij}$ the root of the determinant $|a_{ij} - \lambda b_{ij}|$ can be broken
up into two or more. Denoting by $X^{(1)}$, $X^{(2)}$ two solutions of (13)
corresponding to infinitesimally different $\lambda$'s and by $Y^{(1)}$, $Y^{(2)}$ those of (14), we
have from Theorems I, II

       $B(X^{(1)}, Y^{(2)}) = 0, \quad B(X^{(1)}, Y^{(1)}) = 1,$
       $B(X^{(2)}, Y^{(1)}) = 0, \quad B(X^{(2)}, Y^{(2)}) = 1.$


Let us now pass back to the limiting case where $\lambda$ is a multiple root of
the determinant. When the limiting solutions $X^{(1)}$, $X^{(2)}$ are linearly
independent, as they were in the regular case, discussed in § 3, the same
relations hold in the limit:

       $B(X^{(1)}, Y^{(2)}) = 0, \quad B(X^{(1)}, Y^{(1)}) = 1,$
       $B(X^{(2)}, Y^{(1)}) = 0, \quad B(X^{(2)}, Y^{(2)}) = 1.$

But when in the limit these solutions X coalesce, these four relations must
reduce to a single one, and this turns out to be $B(X, Y) = 0$. The
limit of such expressions as $B(X^{(1)}, Y^{(2)})$ must be zero while in the case
of $B(X^{(1)}, Y^{(1)})$ the constant multipliers needed to make it unity have
become infinite.
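The coalescence just described can be watched on a two-rowed example. This is a sketch under assumptions of my own choosing: the matrix `a` depends on a small parameter `eps` that separates a double root, `b` is the unit matrix, each solution is scaled so that its largest entry is 1, and the convention B(X, Y) = sum of b_ij y_i x_j is assumed.

```python
from fractions import Fraction

def nullspace(m):
    """Return a basis of the solutions z of m z = 0, using exact row reduction."""
    rows = [[Fraction(v) for v in row] for row in m]
    ncols = len(rows[0])
    pivots, r = [], 0
    for c in range(ncols):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        pv = rows[r][c]
        rows[r] = [v / pv for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [v - f * w for v, w in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    basis = []
    for free in [c for c in range(ncols) if c not in pivots]:
        vec = [Fraction(0)] * ncols
        vec[free] = Fraction(1)
        for row_idx, pc in enumerate(pivots):
            vec[pc] = -rows[row_idx][free]
        basis.append(vec)
    return basis

def scaled(v):
    """Scale a vector so that its entry of largest absolute value equals 1."""
    big = max(v, key=abs)
    return [c / big for c in v]

def diagonal_products(eps):
    """B(1,1) and B(2,2) for a = [[1, 1], [0, 1 + eps]] and b the unit matrix."""
    eps = Fraction(eps)
    a = [[Fraction(1), Fraction(1)], [Fraction(0), 1 + eps]]
    values = []
    for lam in (Fraction(1), 1 + eps):
        pencil = [[a[i][j] - (lam if i == j else 0) for j in range(2)] for i in range(2)]
        x = scaled(nullspace(pencil)[0])                               # solution of (13)
        y = scaled(nullspace([list(col) for col in zip(*pencil)])[0])  # solution of (14)
        values.append(sum(u * v for u, v in zip(x, y)))                # B(X, Y), b = unit matrix
    return values

# The products B(i, i) shrink with eps, so the multipliers 1 / B(i, i)
# needed for normalization grow without bound.
shrinking = [diagonal_products(Fraction(1, 10 ** k)) for k in (1, 2, 3)]

# In the limit eps = 0 the two solutions coincide and B(X, Y) = 0.
limit = diagonal_products(0)
```

In this example the products are `-eps` and `eps`, so the normalizing multipliers behave like `1/eps`, in agreement with the remark that they become infinite in the limit.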

There are not enough linearly independent solutions to make sets such
as (16) or (16'), and the previous method of procedure in the reduction of
pairs of forms breaks down. The difference of two solutions $X^{(1)}$, $X^{(2)}$ is
suggested as a substitute for one of them and then the derivative with
regard to $\lambda$ is naturally considered. We shall find that the introduction of
the solutions of derived equations in place of those missing from the set
of primary solutions furnishes the way out of the difficulty.

As we have noted in § 1, for a q-fold zero $\lambda_i$ the $\lambda$-determinant will
have rank $n - r$ where r may have any value from 1 to q and there are
r linearly independent solutions of (13). The numbers q and r which are
both of fundamental significance may be designated respectively the
$\lambda$-multiplicity and the index of $\lambda$ and the r linearly independent solutions of (13)
have been designated primary solutions. When r is less than q, as it is
in the irregular case discussed in this section, corresponding to one or
more of these equal $\lambda$'s there must be solutions which may be conceived
of as multiple solutions of the equations (13). By an alteration of one
of the coefficients $a_{ij}$ the q zeros of the $\lambda$-determinant become distinct
and the q solutions X will also be distinct. On passing back continuously
to the original $a_{ij}$, each of the q distinct solutions X must have a limit.
When the number r of the limiting solutions is less than q, some of them
must be multiple solutions. The numbers of solutions X which have united
to form one of the limiting solutions may be $m_1$, to form another $m_2$, and
to form the rth it may be $m_r$. These numbers $m_1, \dots, m_r$ may be called
the solution-multiplicities of the solutions of the equations in the x's
corresponding to this $\lambda_i$ and are subject to the relation $m_1 + m_2 + \cdots + m_r = q$.
Since multiple solutions of equations or sets of equations are often also
solutions of the corresponding equations formed by taking the derivatives,
it is natural to investigate if something of that sort takes place in this
problem.

Let us now take up the detailed discussion of the irregular case of this
section. For some characteristic number $\lambda_i$ the $\lambda$-index is less than the
$\lambda$-multiplicity, and hence not all the solution-multiplicities are unity. By
altering one of the coefficients $a_{ij}$, two roots of the characteristic equation
will be made unequal. Calling them $\lambda_1$ and $\lambda_1 + \Delta\lambda$, the corresponding
solutions of (13) $x_1, \dots, x_n$ and $x_1 + \Delta x_1, \dots, x_n + \Delta x_n$ respectively, and
the new a's, $a'_{ij}$, and subtracting the two sets of equations we have

       $\sum_j (a'_{ij} - \lambda_1 b_{ij}) \, \Delta x_j = \Delta\lambda \sum_j b_{ij} (x_j + \Delta x_j),$

while on dividing by $\Delta\lambda$ and passing to the limit the relations

(26)   $\sum_j (a_{ij} - \lambda_1 b_{ij}) \, \dfrac{dx_j}{d\lambda} = \sum_j b_{ij} x_j$

result. This set of non-homogeneous equations* may be obtained formally
by taking the derivative with regard to $\lambda$ of the set (13), but for our
purposes they may be best written in the form

(26')  $\sum_j (a_{ij} - \lambda_1 b_{ij}) \, x_j = \sum_j b_{ij} x_j^{(1)},$

where $x_j^{(1)}$ are the constants which are the primary solution of (13) for $\lambda_1$.
* That the equations (26) have a solution may also be seen from another standpoint as
follows. It will be necessary only to show that if the matrix $\|a_{ij} - \lambda b_{ij}\|$ be augmented
by the addition of another column composed of the terms on the right hand side, the rank
is not greater than $n - 1$. Now, if the rows are multiplied by $y_1^{(1)}, \dots, y_n^{(1)}$ and each
column is added, the result will be zero. For the first n columns this will be true because
the y's are defined to be the solution of such equations and for the last column because
of the relation $B(X^{(1)}, Y^{(1)}) = 0$ which we have noted above.




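If $\lambda_1$ is a triple root, similar procedure shows that the second derivatives
of X satisfy the non-homogeneous equations

(27)   $\sum_j (a_{ij} - \lambda_1 b_{ij}) \, \dfrac{d^2 x_j}{d\lambda^2} = 2 \sum_j b_{ij} \dfrac{dx_j}{d\lambda},$

where $dx_j/d\lambda$ are the constants which represent the solution obtained from (26).
And if $\lambda_1$ is a p-tuple root there are solutions $X, dX/d\lambda, \dots, d^{p-1}X/d\lambda^{p-1}$,
the last satisfying the non-homogeneous equations

(28)   $\sum_j (a_{ij} - \lambda_1 b_{ij}) \, \dfrac{d^{p-1} x_j}{d\lambda^{p-1}} = (p-1) \sum_j b_{ij} \dfrac{d^{p-2} x_j}{d\lambda^{p-2}}.$

These equations (26)-(28) may be designated derived equations and the
solutions will be called solutions* of the derived equations. The
non-homogeneous equations (26)-(28) may for our purposes best be written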

* As examples consider each of four sets of linear equations

($\alpha$)   $(1-\lambda)x_3 = 0, \quad (b-\lambda)x_2 = 0, \quad (c-\lambda)x_1 = 0;$
($\beta$)   $(1-\lambda)x_3 = 0, \quad (1-\lambda)x_2 = 0, \quad (1-\lambda)x_1 = 0;$
($\gamma$)   $(1-\lambda)x_3 = 0, \quad (1-\lambda)x_2 + x_3 = 0, \quad (1-\lambda)x_1 = 0;$
($\delta$)   $(1-\lambda)x_3 = 0, \quad (1-\lambda)x_2 + x_3 = 0, \quad (1-\lambda)x_1 + x_2 = 0.$

On equating the determinant of the coefficients of ($\alpha$) to zero, we have the characteristic
numbers $\lambda = 1, b, c$ and corresponding to these are three linearly independent solutions
of ($\alpha$) which may be written (1, 0, 0), (0, 1, 0), (0, 0, 1). In ($\beta$) the characteristic numbers
are equal but the solutions may be written down as in ($\alpha$). In fact ($\beta$) may be regarded
as the limiting case of ($\alpha$) when b and c approach 1. When $\lambda$ is set equal to 1 in
equations ($\alpha$) the rank of the matrix is two while in ($\beta$) it is zero.

In ($\gamma$) and ($\delta$) the determinants again have $\lambda = 1$ as a triple root, there being two linearly
independent solutions, (1, 0, 0), (0, 1, 0); and one, (1, 0, 0), respectively. If one coefficient
be modified slightly the characteristic numbers will be separated and there will be three linearly
independent solutions of the equations in x. On passing back to the original form, two
solutions of ($\gamma$) unite in one and the other remains distinct, while in ($\delta$) they all unite in one.

In all three sets of equations ($\beta$), ($\gamma$), ($\delta$), n = 3, q = 3; while r is 3, 2, 1, respectively.
For ($\beta$), $m_1 = m_2 = m_3 = 1$; for ($\gamma$), $m_1 = 2$, $m_2 = 1$; for ($\delta$), $m_1 = 3$. In the case of
the set ($\delta$) there will be two successive sets of derived equations.
(¢) 
- (= 0). (1—A = {| —— = 
rt (= 0) ( ) (= 0+¢,0) 
dis ad? x. d? x5 dx, \' 
ad A) dR dk di 0+¢,1), 


where the primes on the right indicate that the solutions found in the next previous set 
of equations are substituted for these symbols. The augmented matrix in (¢) consists of 




(28')  $\sum_j (a_{ij} - \lambda_1 b_{ij}) \, x_j = \sum_j b_{ij} x'_j,$

where the constants $x'_j$ are the solution of the next preceding derived equation.

The sum of the solution-multiplicities corresponding to $\lambda_i$ is q, as we
have seen above; the total number for all $\lambda$'s is n and the sum of the
$\lambda$-multiplicities for the various $\lambda$'s is also n. The total number of primary
solutions (that is, solutions of (13) for all values of $\lambda$) is less than n in
the case considered here; together with the solutions of the derived equations
((26)-(28) or (28')) the total is n. The simplest case, arising when the total
number of primary solutions is n, that is, when there is no multiple solution
of (13), has been called the regular case, and treated in §§ 2-4.

The systems of equations (13) and (14) are called adjoint to one another.
They have the same determinant, the same characteristic numbers, and
the same solution-multiplicities. When there are m linearly independent
solutions of one set, there are the same number of the other. Similar
derived equations may be set up for the y's; corresponding to each $\lambda$ there
will be the same number each of primary solutions and solutions of the
derived equations as there are for the x's.

Corresponding to any $\lambda$ the solutions may be arranged in as many
groups as the index indicates, and within each group the primary solution
may be followed by the successive derived solutions. To every non-singular
matrix $\|A - \lambda B\|$ corresponds then a total of n primary solutions and
solutions of the derived equations in x, and n for the equations in y: the
totality can for convenience be designated as before in (16) and (16') by
$X^{(1)}, \dots, X^{(n)}$; $Y^{(1)}, \dots, Y^{(n)}$. As we shall see in § 6, these solutions when
properly chosen with reference to orthogonalization and normalization
furnish exactly the matrices $\|X\|$, $\|Y\|$ needed to make the transformation
of pairs of bilinear forms to canonical shape.

It may be well here to interpolate some remarks concerning the number 
of fundamental solutions on which the primary solutions and the solutions 
of derived equations depend. 

four columns and three rows and for $\lambda = 1$ is of rank 2. One solution is (0, 1, 0) and from
the theory of linear equations we know that the general solution is $(1, 0, 0) + c_1 (0, 1, 0)$
$= (1, c_1, 0)$. For ($\zeta$) the augmented matrix is still of rank 2 while one solution is (0, 0, 1)
and the general solution is $(1, 0, 0) + c_1 (0, 1, 0) + c_2 (0, 0, 1) = (1, c_1, c_2)$. The one solution
of the original set ($\delta$) has been augmented by two others and the solutions (1, 0, 0), (0, 1, 0),
(0, 0, 1) of ($\delta$) and its derived equations play a fundamental rôle in the discussion that follows.

Because it presents the simplest problem, the case where a $\lambda$-index is
unity will be discussed first.* Let us assume further that the
solution-multiplicity is three. Designating by $\Xi^{(1)} = (\xi_1^{(1)}, \dots, \xi_n^{(1)})$ a solution of
the equations (13), the rank of the matrix of coefficients on the left of
(26) is $n - 1$ and since it is known that there are solutions of these
non-homogeneous equations, the rank of this matrix augmented by a new column made
up of the terms on the right (after the set of constants $\Xi^{(1)}$ is substituted
for the x's) cannot be greater than $n - 1$. We may denote by $\Xi^{(2)}$ a
solution of the non-homogeneous equations (26) and note that by the
addition of a constant multiplied by $\Xi^{(1)}$ this still remains a solution of
(26). In fact this solution $\Xi^{(2)} + c\,\Xi^{(1)}$ is the most general possible.

* In this connection, see the last footnote above.
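The successive solution of the derived equations in the form (26') and (28') can be followed on the set ($\delta$) of the footnote. The sketch below rests on assumptions: the unknowns are taken in the order $x_1, x_2, x_3$, `b` is the unit matrix, the helper names are hypothetical, and each particular solution is obtained by setting the undetermined unknown to zero.

```python
from fractions import Fraction

def particular_solution(m, rhs):
    """Return one solution of m z = rhs (free unknowns set to 0), or None if there is none."""
    rows = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(m, rhs)]
    ncols = len(m[0])
    pivots, r = [], 0
    for c in range(ncols):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        pv = rows[r][c]
        rows[r] = [v / pv for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [v - f * w for v, w in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    if any(row[-1] != 0 for row in rows[r:]):
        return None   # the augmented matrix has larger rank than the coefficient matrix
    z = [Fraction(0)] * ncols
    for i, c in enumerate(pivots):
        z[c] = rows[i][-1]
    return z

def apply(m, v):
    """Matrix-vector product m v."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

lam1 = 1
a = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]   # set (delta), unknowns ordered x1, x2, x3 (assumed)
b = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
pencil = [[a[i][j] - lam1 * b[i][j] for j in range(3)] for i in range(3)]

xi1 = [Fraction(1), Fraction(0), Fraction(0)]   # primary solution of (13)
chain = [xi1]
while True:
    nxt = particular_solution(pencil, apply(b, chain[-1]))   # next derived equation
    if nxt is None:
        break   # the following derived equation has no solution
    chain.append(nxt)

# Adding a multiple of the primary solution leaves a solution of (26').
shifted = [u + 5 * v for u, v in zip(chain[1], chain[0])]
still_solves = apply(pencil, shifted) == apply(b, chain[0])
```

Here `chain` comes out as (1, 0, 0), (0, 1, 0), (0, 0, 1), the three sets of constants named in the footnote, and the loop stops at the third derived equation because the augmented matrix then has rank 3.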

Passing now to (27) we note again that the matrix of the coefficients
on the left is of rank $n - 1$ and that by substituting for the x's on the
right the solutions of (26) and the addition of the column to the already
augmented matrix, the rank does not increase.* Since the solution
substituted in the right-hand side is linearly dependent on the two sets of
constants $\Xi^{(1)}$, $\Xi^{(2)}$ the general solution of (27) is linearly dependent on
three. One of these, as may readily be noted from the theory of linear
equations, is $\Xi^{(1)}$. Since one of the solutions substituted into the right
hand side in place of the x's is $\Xi^{(2)}$, another of these must be $\Xi^{(2)}$; the
third is new and may be denoted by $\Xi^{(3)}$. Since the same considerations
apply to equations (28), we have the following

THEOREM VII. If $\lambda_i$ has index unity, the solution of the (p-1)th derived
equation is linearly dependent on p sets of constants. Of these p sets of
constants p-1 are solutions also of the (p-2)th derived equation and one is new.

* If the matrix $\|a_{ij} - \lambda b_{ij}\|$ is extended to the right by the addition successively as
columns of the right hand sides of (26), (27), etc., the resulting matrices have some
interesting properties. Let us suppose, for example, that n = 3 and that there is only
one primary solution. The normal type of the matrix $\|a_{ij} - \lambda b_{ij}\|$ will in this case be
proved later to be (33), the determinant being divisible by $(\lambda_1 - \lambda)^3$ but being of rank 2.
The four-column matrix obtained by augmenting it by the right hand side of (26) is,
however, such that all three-column determinants are divisible by $(\lambda_1 - \lambda)^2$ but not all by
a higher power. The fifth column made up of the right hand side of (27) introduces
determinants which are divisible by $\lambda_1 - \lambda$ but not by a higher power, and this is true even
when the fourth column is left out of consideration. The solutions of the corresponding
equations in the x's are all single-valued ones. The addition of a sixth column made up
of the solutions of the second derived equations introduces determinants which do not
vanish for $\lambda = \lambda_1$ even when the fourth and fifth columns are neglected. There is no
solution of the corresponding equations.

The solution of the equations (26) can be regarded as a double one. In fact each
determinant containing the fourth column is intimately related to the derivative with
respect to $\lambda$ of the determinant of the first three.

When, however, the index of $\lambda_i$ is $m > 1$ there are m linearly independent
primary solutions of (13). If the number p-1 of solutions of derived
equations corresponding to each of these is the same, the m primary solutions
may at first be selected at random (except that they must be linearly
independent), the solutions of the derived equation corresponding to each
one being then built up. Each of the m solutions of the (p-1)th derived
equations will be linearly dependent on p sets of constants as in the case
where the index is 1, and the total number of different sets of constants
to be considered is mp. As will be shown later (Theorem XI) in order
to satisfy the conditions the primary solutions must finally be chosen so
that each one of any set composed of a primary solution and the solutions
of the corresponding derived equations is orthogonal to all those of any
other set. In fact the nature of the dependence of the primary solutions
finally chosen on the ones selected at random is determined by making
these sets mutually orthogonal. Solutions so chosen may be conceived of
as the limit of separate solutions which have united and will be designated
proper primary solutions. Orthogonality would naturally be expected between
the various groups which form such limiting cases.

When the index of $\lambda_i$ is m but the solutions have not all the same
multiplicity, some adjustment of the linear dependence of the primary 
solutions on the set chosen first at random must again be made in order 
to insure that solutions of the successive derived equations be orthogonal. 
If one primary solution is to have a greater multiplicity than any of the 
others, it will necessarily be unique. This proper primary solution will 
then be entirely determined by the adjustments necessary to obtain solutions 
of the derived equations corresponding to it. On the other hand the multi- 
plicity of any solution is not lowered by adding to it a solution of greater 
multiplicity and this principle may be used in adjusting the orthogonalization 
of the former in making it a proper primary. 

Let us proceed now to develop the relations between the various sets 
of solutions X and Y analogous to those obtained in the regular case in 
§§ 2-3. 

THEOREM VIII. The totality n of solutions consisting of the primary 
solutions of (13) or of (14) together with the solutions of the derived equations 
are linearly independent. 

Following the method of proof of Theorem IV let us again assume the
matrix composed of the X's to be of rank $n - m$. There are then constants
$c_1, \dots, c_n$ such that

(29)   $c_1 X^{(1)} + \cdots + c_n X^{(n)} = 0,$

and of these constants m may be assigned at random. The equations
(26')-(28') may for any particular $\lambda_k$ be written in the form

(30)   $\sum_j (a_{ij} - \lambda_k b_{ij}) \, x_j^{(k)} = \varepsilon_k \sum_j b_{ij} x_j^{(k-1)},$

where the $\lambda_k$ are n in number, one for each solution whether primary or
of derived equations, and where $\varepsilon_k = 0$ for a primary solution and $= 1$
for a solution of a derived equation. Proceeding now as in the proof
referred to, we are led to the following equations analogous to (23):

       $\sum_k c_k \sum_j b_{ij} \, [\lambda_k x_j^{(k)} + \varepsilon_k x_j^{(k-1)}] = 0.$

From them we deduce analogous to (24) the relations

       $\sum_k c_k \, [\lambda_k X^{(k)} + \varepsilon_k X^{(k-1)}] = 0.$

We have here and in (29) two essentially different relations between the
solutions unless all the $\lambda$'s are equal and all $\varepsilon$'s zero. But in this event
the index is n and each solution-multiplicity is unity and we have seen
that the solutions can then be chosen linearly independent. We are thus
in every case led to a contradiction and the theorem is established.

To determine the relations between the various primary solutions and
solutions of the derived equations, it is desirable to generalize formulas
(20) and (21) to cover the new types of solutions. In the first place it
may be observed that precisely these same formulas hold if l and k both
designate primary solutions corresponding to distinct $\lambda$'s. More generally
we have by multiplication of the derived equations

       $\sum_j (a_{ij} - \lambda_l b_{ij}) \, x_j^{(l)} = \sum_j b_{ij} x_j^{(l-1)}$

by $y_i^{(k)}$ and addition,

(31)   $A(l, k) - \lambda_l B(l, k) = B(l-1, k);$

and on multiplication of the derived equations

       $\sum_i (a_{ij} - \lambda_k b_{ij}) \, y_i^{(k)} = \sum_i b_{ij} y_i^{(k-1)}$

by $x_j^{(l)}$ and addition,

(32)   $A(l, k) - \lambda_k B(l, k) = B(l, k-1).$

The terms on the right of (31) and (32) must be interpreted to be
identically zero provided l, k respectively designate primary solutions.

THEOREM IX. If l, k correspond to distinct characteristic numbers,
$B(l, k) = 0$.

For, this being the case when $l = 1$ and $k = m_1 + 1$ designate primary
solutions, it follows from (31) and (32) in a manner analogous to that
used in the proof of Theorem I that it is true for $l = 2, 3, \dots$, etc.;
and by another application of the same argument, that it is true for
$k = m_1 + 2, m_1 + 3, \dots$; and hence for all corresponding solutions of
derived equations.

It is next in order to discover the relations between the primary
solutions and solutions of derived equations, corresponding to the same
characteristic number $\lambda_1$, whose index is unity and solution-multiplicity p.
Whether or not there are other solutions, there is a group of $p^2$ terms
$B(l, k)$ $(l, k = 1, \dots, p)$ among which certain equalities exist and certain of
which will be zero. From (31) and (32) we note that $B(l-1, k) = B(l, k-1)$
and hence those terms in the group which occur in any one of the cross
diagonals are equal. Further since $B(l, 0) = 0$, we have from this
same equality $0 = B(2, 0) = B(1, 1)$ and similarly $B(2, 1) = 0, \dots,$
$B(p-1, 1) = 0$. On the other hand $B(p, 1)$ cannot be zero; for,
since each solution in this group is orthogonal to those outside, it may
be shown as in Theorem II that if this were the case, the solutions
$Y^{(1)}, \dots, Y^{(n)}$ would be linearly dependent, which is contrary to Theorem VIII.
Hence $B(p, 1)$ may be set equal to 1 and from (31) it follows that
$A(p, 1) = \lambda_1 B(p, 1) = \lambda_1$. It is then possible to state the following

THEOREM X. In the group of $p^2$ terms $B(l, k)$ corresponding to a
characteristic number $\lambda_1$ of index 1 and solution-multiplicity p, terms in any cross
diagonal are equal, all terms above the main cross diagonal are zero, and
by normalization each term in the main cross diagonal takes the form
$\lambda_1 - \lambda$.
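The cross-diagonal pattern of Theorem X can be checked on a three-rowed example. The sketch below uses hypothetical data: `a` has the single characteristic number 1 with index 1, `b` is the unit matrix, the chains `xs` and `ys` were picked by hand (with arbitrary multiples of earlier solutions added), and B(l, k) is taken as the sum of b_ij y_i^(k) x_j^(l).

```python
def apply(m, v):
    """Matrix-vector product m v."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

lam1 = 1
a = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]   # hypothetical a_ij, triple characteristic number 1
b = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # hypothetical b_ij (unit matrix)
pencil = [[a[i][j] - lam1 * b[i][j] for j in range(3)] for i in range(3)]
pencil_t = [list(col) for col in zip(*pencil)]
b_t = [list(col) for col in zip(*b)]

# Hand-picked chains: a primary solution followed by solutions of the derived equations.
xs = [[1, 0, 0], [2, 1, 0], [5, 2, 1]]
ys = [[0, 0, 1], [0, 1, 3], [1, 3, 7]]

# Check the primary and derived equations for the x's and for the y's.
assert apply(pencil, xs[0]) == [0, 0, 0]
assert apply(pencil_t, ys[0]) == [0, 0, 0]
for s in (1, 2):
    assert apply(pencil, xs[s]) == apply(b, xs[s - 1])
    assert apply(pencil_t, ys[s]) == apply(b_t, ys[s - 1])

# table[l][k] holds B(l + 1, k + 1).
table = [[dot(apply(b, x), y) for y in ys] for x in xs]

# Relation (31): A(l, k) - lam1 * B(l, k) = B(l - 1, k), with B(0, k) read as zero.
shifted = [[dot(apply(pencil, x), y) for y in ys] for x in xs]

cross_diagonals_equal = all(
    table[l][k] == table[l + 1][k - 1] for l in range(2) for k in range(1, 3)
)
zero_above_main = all(table[l][k] == 0 for l in range(3) for k in range(3) if l + k + 2 in (2, 3))
```

For these chains `table` has zeros above the main cross diagonal, the value 1 along it, and equal entries along each lower cross diagonal, whatever multiples of the earlier solutions are added to the later ones.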

In discussing the relations between the solutions corresponding to a $\lambda$
whose index is greater than unity, let us, for the sake of definiteness,
denote a set of primary solutions by $X^{(1)}$, $Y^{(1)}$ and the solutions of their
corresponding derived equations by $X^{(2)}, \dots, X^{(m_1)}$; $Y^{(2)}, \dots, Y^{(m_1)}$ and a
second set of primary solutions by $X^{(m_1+1)}$, $Y^{(m_1+1)}$ and the corresponding
solutions of the derived equations by $X^{(m_1+2)}, \dots, X^{(m_1+m_2)}$; $Y^{(m_1+2)}, \dots, Y^{(m_1+m_2)}$,
and let $m_1$ be less than $m_2$. These primary sets when properly determined may be
conceived of as respectively $m_1$- and $m_2$-tuple solutions of the linear equations
(13) for the same $\lambda$.

From (31) and (32) we derive immediately

THEOREM XI. If l is a value from one of the following sets and k from
the other, $1, \dots, m_1$; $m_1 + 1, \dots, m_1 + m_2$, then $B(l-1, k) = B(l, k-1)$,
provided that if l or k is 1 these values of the B's are understood to be
zero; and further $B(l-1, m_1+1) = 0$, $B(m_1+1, k-1) = 0$.

COROLLARY. If these $B(l, k)$ are written down in $m_1$ rows and $m_2$
columns, then all terms to the left and above the cross diagonal passing
through the upper right hand corner are zero.




The formulas in Theorems IX—XI automatically hold whatever be the 
primary solutions (whether proper or not). That is, they are valid no 
matter what (linearly independent) solutions of (13) are chosen to start 
with. Proper primary solutions are, however, the limits of solutions which 
can be regarded as having been separated by a change of one of the 
coefficients and for such primary solutions and corresponding solutions of 
derived equations much more drastic conditions are valid. These relations 
will now be embodied in 

THEOREM XII. If l is one of the numbers from the two sets $1, \dots, m_1$;
$m_1 + 1, \dots, m_1 + m_2$ which correspond to two proper primary solutions and
the solutions of the derived equations for the same characteristic number,
and k one of the numbers from the other set, then

       $B(l, k) = 0.$

A part of this result has already been formulated in the corollary above.
As typical of the parts still remaining unproved, let us show that
$B(m_1 + m_2 - 1, m_1) = 0$. Since $Y^{(1)}$ is an $m_1$-tuple solution, $B(m_1 + m_2 - 1, 1) = 0$
is an $m_1$-tuple relation in $y_i$. Hence it may be shown by a method analogous
to that used in the derivation of formulas (26)-(28) that

       $B(m_1 + m_2 - 1, 2) = B\!\left(X^{(m_1+m_2-1)}, \dfrac{dY^{(1)}}{d\lambda}\right) = 0,$

and

       $B(m_1 + m_2 - 1, 3) = 0, \ \dots, \ B(m_1 + m_2 - 1, m_1) = 0.$

And in a similar way it may be shown that all the other B's involving
an X from one set and a Y from the other are zero.


6. SIMULTANEOUS REDUCTION OF TWO MATRICES. IRREGULAR CASE 


Preparatory to a discussion of the general theory of the reduction of
a pair of matrices in the irregular case, it is well to consider a couple
of special examples. Let us first treat the case where there is but
one characteristic number $\lambda_1$, with multiplicity three. The problem is
again that of determining matrices $\|X\|$, $\|Y\|$ so that the product matrix
$\|Y\| \, \|A - \lambda B\| \, \|X\|$ is of a normal type. For this purpose it is necessary
to use the primary solutions $X^{(1)}$, $Y^{(1)}$ of the two sets each of three linear
homogeneous equations which correspond to the matrix $\|A - \lambda B\|$,
together with the solutions of the corresponding derived equations $X^{(2)}$, $X^{(3)}$;
$Y^{(2)}$, $Y^{(3)}$. It is proposed to show that by such a multiplication the matrix
may be reduced* to

(33)   $\begin{Vmatrix} 0 & 0 & \lambda_1 - \lambda \\ 0 & \lambda_1 - \lambda & 1 \\ \lambda_1 - \lambda & 1 & 0 \end{Vmatrix}.$

* Cf. second footnote § 5.


In the first place it may be noted from Theorem X that terms in and above
the main cross diagonals are as given in (33) and further that it is necessary
to discuss only $B(2, 3)$ and $B(3, 3)$. If it can be shown that by adjustment
of the solutions of the derived equations these two may be equated to zero,
it follows that the product matrix has the desired form. For, by (31) it
would follow that $A(2, 3) = B(1, 3) = 1$ and $A(3, 3) = \lambda_1 B(3, 3) + B(2, 3) = 0$.

Denoting by $X^{(1)} = \Xi^{(1)}$ the primary solution and by $\Xi^{(2)}$ and $\Xi^{(3)}$ the
solutions of the derived equations, as discussed in § 5, the problem may
be put in the form of determining constants $\alpha$ and $\beta$ such that

       $X^{(2)} = \Xi^{(2)} + \alpha\,\Xi^{(1)}, \qquad X^{(3)} = \Xi^{(3)} + \alpha\,\Xi^{(2)} + \beta\,\Xi^{(1)}$

satisfy the relations $B(2, 3) = 0$, $B(3, 3) = 0$ or in other words

       $B(\Xi^{(2)}, Y^{(3)}) + \alpha B(\Xi^{(1)}, Y^{(3)}) = 0,$
       $B(\Xi^{(3)}, Y^{(3)}) + \alpha B(\Xi^{(2)}, Y^{(3)}) + \beta B(\Xi^{(1)}, Y^{(3)}) = 0,$

and since, from Theorem X,

       $B(\Xi^{(1)}, Y^{(3)}) = B(X^{(1)}, Y^{(3)}) = 1,$

this is readily done. It may be observed that this procedure changes the
values of none of the terms in the main cross diagonal or above and hence
the result is established.*
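The adjustment by the constants $\alpha$ and $\beta$ can be carried through numerically. The following is a sketch with hypothetical data: `a` has the single characteristic number 1, `b` is the unit matrix, the starting chains `xi` and `ys` were picked by hand, and B(l, k), A(l, k) are evaluated with the assumed convention sum of c_ij y_i^(k) x_j^(l).

```python
from fractions import Fraction

def apply(m, v):
    """Matrix-vector product m v."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def form(c, x, y):
    """Bilinear value sum over i, j of c[i][j] * y[i] * x[j] (assumed convention)."""
    return sum(p * q for p, q in zip(apply(c, x), y))

a = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]   # hypothetical a_ij, triple characteristic number 1
b = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # hypothetical b_ij (unit matrix)

# Hand-picked primary solution and solutions of the derived equations.
xi = [[1, 0, 0], [2, 1, 0], [5, 2, 1]]
ys = [[0, 0, 1], [0, 1, 3], [1, 3, 7]]
y3 = ys[2]

# Choose alpha so that B(2, 3) = 0, then beta so that B(3, 3) = 0.
unit = Fraction(form(b, xi[0], y3))            # B(Xi1, Y3), equal to 1 here
alpha = -form(b, xi[1], y3) / unit
beta = -(form(b, xi[2], y3) + alpha * form(b, xi[1], y3)) / unit

xs = [
    [Fraction(v) for v in xi[0]],
    [u + alpha * v for u, v in zip(xi[1], xi[0])],
    [u + alpha * v + beta * w for u, v, w in zip(xi[2], xi[1], xi[0])],
]

# Entry [l][k] of each table is the value for X^(l + 1), Y^(k + 1).
b_table = [[form(b, x, y) for y in ys] for x in xs]
a_table = [[form(a, x, y) for y in ys] for x in xs]
```

With these numbers `alpha` is -5 and `beta` is 7; `b_table` then has units on the main cross diagonal and zeros elsewhere, and `a_table` minus $\lambda$ times `b_table` has the shape (33) with $\lambda_1 = 1$.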


*From the relation 


1 0 0 | O 0 | 0 0 1 
| 
— Ae Ay — Ay 
1 1 1 1 


we note that a normal form such as that obtained in Theorem VI may be transformed into
one similar to (33) where the $\lambda$-roots are distinct. The new shape might be taken as


0 
0 Ag A 1 
|| 




If there were other characteristic numbers each with the index one, 
precisely the same procedure would reduce the corresponding square block 
of matrix to a form similar to (33), each of which blocks in succession 
would be placed below and to the right of the former ones, while by 
Theorem XII all the remaining blocks of terms consist of zeros throughout. 

The method of procedure is, however, somewhat different when the index
of a $\lambda$ is greater than unity and less than the $\lambda$-multiplicity. As a typical
example let us consider the case where index is equal to 2 and
$\lambda$-multiplicity to 5. One of the corresponding solution-multiplicities must then
be four or three; which of these it is can be determined by setting up
the derived equations and solving. If the solution is a three-fold only,
the third derived equations will be found to have no possible solution.
Since the case of a three-fold solution and an accompanying two-fold
presents more difficulty, let us discuss it here. We shall show that the
matrices $\|X\|$ and $\|Y\|$ of primary solutions and solutions of derived
equations can be so chosen that the product takes the form

(34)   $\begin{Vmatrix} 0 & 0 & \lambda_1 - \lambda & 0 & 0 \\ 0 & \lambda_1 - \lambda & 1 & 0 & 0 \\ \lambda_1 - \lambda & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & \lambda_1 - \lambda \\ 0 & 0 & 0 & \lambda_1 - \lambda & 1 \end{Vmatrix}.$

The groups of 2 by 3 terms in the upper right and lower left corners
will be zero when the primary solutions are properly chosen (Theorem XII).
In each of the other groups the terms in and above the main cross diagonal
will be after normalization as they appear in (34). By adding a constant
times the primary solution to each of the corresponding solutions of the
derived equations the other terms may be adjusted as in the first problem
discussed above. Such adjustment does not alter in any way the terms
of (34) already determined. The reduction to the form (34) is then
completed.

normal for the regular case, but it is neither as elegant nor so simple to handle as that
chosen. However, on passing to the limit and making the roots equal, this becomes the
normal shape for this irregular case and is from this standpoint of considerable interest.

It is worthy of note that the multiplying matrices which correspond to these
transformations of the x's and y's can be regarded as solutions of linear equations connected
with the original form. These equations bear a marked resemblance to the primary and
derived equations necessary to obtain the matrices $\|X\|$ and $\|Y\|$ used in the reduction
to the shape (33). For example, to obtain the second row of the first matrix the equations
have the form

       $\sum_i (a_{ij} - \lambda_1 b_{ij}) \, y_i = \sum_i b_{ij} y_i^{(2)},$

where the $y^{(2)}$ on the right are constants which are solutions of the primary equations
for $\lambda = \lambda_2$.

We may now formulate the following 

THEOREM XIII. By means of linear transformations of the x's and y's
whose coefficients are respectively the $n^2$ constants made up of: (1) Properly
chosen primary solutions of the equations (13) and (14) associated with the
bilinear forms A(x, y), B(x, y); and (2) the corresponding solutions of the
derived equations, reduction takes place as follows:


m—1 
y) (> Ay Lm--i-1 i Em —i--1 
1 


1 
Me 
1 
n n—1 
\ 
+ — —m ya L2n—m,—i+1Yi-1 
n—m,+1 m—m,~1 
Mey 
B 
1 m-r1 
1 


where $\lambda_1, \dots, \lambda_p$ are characteristic numbers, each being repeated a number
of times equal to its index, and where the solution-multiplicities are
$m_1, \dots, m_p$ respectively ($m_1 + \cdots + m_p = n$).

It is now but a step to a solution of the problem of equivalence of 
pairs of bilinear forms. 

THEOREM XIV. In order that two pairs of bilinear forms $A_1(x, y)$,
$B_1(x, y)$ and $A_2(x, y)$, $B_2(x, y)$ be equivalent it is necessary and sufficient
that the $\lambda$-polynomials be identical and that the corresponding indexes and
the corresponding solution-multiplicities be the same.


7. REDUCTION OF PAIRS OF QUADRATIC AND HERMITIAN FORMS 

It may be well to note some of the modifications of the treatment of
§§ 1-6 which are necessary in order to specialize our theory for the
cases of quadratic and hermitian forms.

In the case of quadratic forms the matrix $A - \lambda B$ is symmetric and the
two sets of linear equations associated with it and corresponding to (13)
and (14) are identical. When each of the solution-multiplicities is unity,
the two sets of solutions (16) and (16') are identical and the discussion
proceeds by replacing the y's by the x's; but when the solution-multiplicity
is greater than unity the solutions of the derived equations are not unique,
and we have seen fit in some cases to leave an originally chosen set of y's
fixed and manipulate the x's to satisfy the prescribed conditions. However,
the methods used can be modified without much difficulty to cover the
case of the quadratic forms. As an example of the only type of
modification necessary, let us prove Theorem V for solutions associated with
quadratic forms.

Denoting as before by $X^{(k_1)}, X^{(k_2)}, X^{(k_3)}$ the linearly independent solutions
originally chosen, it is possible first to show that constants $\alpha, \beta, \gamma$ can be
determined so that

       $X^{(1)} = \alpha X^{(k_1)} + \beta X^{(k_2)} + \gamma X^{(k_3)}$

is normalized. For, if not, $B(1, 1)$ would vanish identically in $\alpha, \beta, \gamma$;
and this involves

       $B(k_1, k_1) = B(k_1, k_2) = B(k_1, k_3) = 0,$

which is a contradiction of Theorem IV.

Let $X^{(k')}$ and $X^{(k'')}$ denote a second and third set which form with $X^{(1)}$
linearly independent solutions of (13) for this value of $\lambda$. If $X^{(k')}$ is not already
orthogonal to $X^{(1)}$ it may be made so by adding to it $\alpha X^{(1)}$. For, since

       $B(X^{(k')} + \alpha X^{(1)}, X^{(1)}) = B(k', 1) + \alpha B(1, 1),$

it may be made zero by a proper choice of $\alpha$. Similarly $X^{(k'')}$, if not
already orthogonal to $X^{(1)}$, may be made so by adding to it $\beta X^{(1)}$. The
solutions $X^{(1)}$, $X^{(k')} + \alpha X^{(1)}$, $X^{(k'')} + \beta X^{(1)}$ satisfy the conditions of linear
independence, for any linear relation between them is also a linear relation
between $X^{(1)}$, $X^{(k')}$, $X^{(k'')}$, which were assumed linearly independent.

Let us proceed now with these two solutions which have been built up
orthogonal to $X^{(1)}$. Any linear combination of them will be orthogonal
to $X^{(1)}$ and we can so determine the combination that a normalized solution
results. For if not, we would have an identical relation in the multipliers
which gives $B(k', k') = 0$, $B(k', k'') = 0$ and from Theorem IV we note
that this is not compatible with $B(k', 1) = 0$. Denoting by $X^{(2)}$ this second
solution and by $X^{(l)}$ a third solution, linearly independent of $X^{(1)}$, $X^{(2)}$, we
can determine $\alpha, \beta, \gamma$ so that $X^{(3)} = \alpha X^{(1)} + \beta X^{(2)} + \gamma X^{(l)}$ is normalized
and orthogonal to $X^{(1)}$ and $X^{(2)}$. For, the equations involved reduce to

$$\alpha B(1,1) + \beta B(1,2) + \gamma B(1,l) = 0,$$
$$\alpha B(2,1) + \beta B(2,2) + \gamma B(2,l) = 0,$$
$$\alpha B(l,1) + \beta B(l,2) + \gamma B(l,l) = 1/\gamma,$$




1924] EQUIVALENCE OF FORMS 477 


and in solving for $\alpha, \beta, \gamma$ the denominator determinant consisting of the
coefficients on the left may be proved different from zero, in a manner
analogous to that used with (25). Since $B(1,1) = B(2,2) = 1$,
$B(1,2) = B(2,1) = 0$, the solution for $\gamma$ has for numerator $1/\gamma$ and hence
$\gamma^2 = c \neq 0$, and from this $\alpha$ and $\beta$ are readily found.

THEOREM XV. If the form $B(x, x)$ is definite, each solution-multiplicity
is unity.

The $\lambda$-determinant in this case has real roots only and (13) has real
solutions. In order that the solution-multiplicity be greater than unity we 
have seen in § 5 that it is necessary and sufficient that B(k,k) = 0, 
where k denotes one of the primary solutions. But this is not possible 
for a real definite quadratic form. 

As a simple example of the comparative elegance of this new theory
of the reduction of quadratic forms, let us consider the reduction to normal
type of a single quadratic form $A(x,x)$ in $n$ variables and of rank $m$.
In the literature this is done by discussing separately the elimination of
the $n - m$ superfluous variables and the reduction of the resulting form
by $m$ distinct steps, each time simplifying the problem by separating out
one variable. The procedure in each of these steps is different according
as there does or does not exist a term in the main diagonal different from
zero. On the other hand, from the new point of view let us take the
auxiliary form $B(x,x) = \sum x_i^2$ and consider the pencil $A - \lambda B$. There
will be $n$ roots of the $\lambda$-determinant of which $n - m$ will be zero. The
linear equations (13) have always real solutions of multiplicity unity and
the matrix $\|X\|$ obtained by solving them represents an orthogonal
transformation which reduces $A(x,x)$ to the form $\sum \lambda_i x_i^2$.
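The reduction just described can be checked numerically in the smallest case. The following is a minimal sketch, assuming $n = 2$ and the illustrative form $A(x,x) = x_1^2 + 2x_1x_2 + x_2^2$ of rank $m = 1$; the function names and the sample form are hypothetical and do not appear in the paper. The pencil with the auxiliary form $B(x,x) = x_1^2 + x_2^2$ should then have $n - m = 1$ zero root, and the matrix of normalized solutions should carry $A$ into a sum of squares.

```python
import math

def reduce_form_2x2(a11, a12, a22):
    """Reduce A(x,x) = a11*x1^2 + 2*a12*x1*x2 + a22*x2^2 by an orthogonal
    transformation, using the auxiliary form B(x,x) = x1^2 + x2^2.

    Returns (roots, X): the two roots of det(A - lambda*B) = 0 and the
    orthogonal matrix X whose columns are the normalized solutions.
    """
    # Rotation angle that removes the cross term 2*a12*x1*x2.
    theta = 0.5 * math.atan2(2.0 * a12, a11 - a22)
    c, s = math.cos(theta), math.sin(theta)
    lam1 = a11 * c * c + 2.0 * a12 * c * s + a22 * s * s  # A on column (c, s)
    lam2 = a11 * s * s - 2.0 * a12 * c * s + a22 * c * c  # A on column (-s, c)
    X = [[c, -s], [s, c]]
    return (lam1, lam2), X

def form_value(a11, a12, a22, x):
    """Value of the quadratic form at the point x = (x1, x2)."""
    return a11 * x[0] ** 2 + 2.0 * a12 * x[0] * x[1] + a22 * x[1] ** 2

# Hypothetical sample form of rank 1 in two variables: (x1 + x2)^2.
roots, X = reduce_form_2x2(1.0, 1.0, 1.0)

# Substituting x = X y should give roots[0]*y1^2 + roots[1]*y2^2.
y = (0.3, -1.2)
x = (X[0][0] * y[0] + X[0][1] * y[1], X[1][0] * y[0] + X[1][1] * y[1])
reduced = roots[0] * y[0] ** 2 + roots[1] * y[1] ** 2
original = form_value(1.0, 1.0, 1.0, x)
```

For forms in more variables the same idea would call for a general symmetric eigenvalue routine; the closed-form rotation is used here only to keep the sketch self-contained.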

Let us now elaborate somewhat the developments which concern a pair 
of hermitian forms. These forms may be considered as a generalization 
of the real quadratic forms and perhaps their increasing importance in 
mathematical theory is due in part to the possibility of discussing their 
minimum properties (which we shall do in II).

Let $\|A\|$ and $\|B\|$ be the matrices of two linearly independent hermitian
forms,

$$A(z, \bar z) = \sum a_{ij} \bar z_i z_j, \qquad B(z, \bar z) = \sum b_{ij} \bar z_i z_j \qquad (a_{ij} = \bar a_{ji},\ b_{ij} = \bar b_{ji}),$$

of which B is non-singular. For any real $\lambda$, $A + \lambda B$ is also an hermitian
form and will take on real values only. The coefficients of $\lambda$ in the
$\lambda$-determinant will be real, and hence those characteristic numbers which
are not real will occur in conjugate pairs. From the relation $A(k, k)
= \lambda_k B(k, k)$ we note that the real hermitian forms must both be zero
for a complex characteristic number. If B is definite or more generally




if each solution-multiplicity is unity, all the characteristic numbers must 
therefore be real. If the form $B(z, \bar z)$ is definite, each solution-multiplicity
is unity, as may be proved in a manner analogous to Theorem XV.

The reduction of pairs of hermitian forms to normal hermitian types 
proceeds as in §§ 4, 6 with small modifications of the same general character 
as those indicated above for quadratic forms. The types are not different 
from those obtained by other methods.* Some further comments concerning 
the problem may, however, be in order. 

For any real $\lambda$ the solutions of the equations corresponding to (13) will
be conjugate to those corresponding to (14); where $\lambda$ is complex the solutions
of (13) will be conjugate imaginary to those of (14) for $\bar\lambda$, and the same
remarks hold for the derived equations, as may be seen by a study of (30)
and its conjugate equation. Hence a complete set of solutions such as
(16) will be matched with a conjugate set such as (16') but not always
in the same order.

We note here, then, that the matrix after the usual multiplication by
the matrices of primary and derived solutions may not be hermitian. When
the solution-multiplicities are all unity the corresponding elements of the
two matrices of solutions $\|x\|$ and $\|y\|$ will be conjugate imaginary and
the product matrix will be hermitian. But if there are complex $\lambda$'s, the
product matrix will not be hermitian unless the solutions are rearranged
so as to make the corresponding solutions conjugate imaginary. This may,
however, be done by interchanging in one of the matrices the two sets
of solutions corresponding to $\lambda$ and $\bar\lambda$. To exhibit a special example where
the product matrix is made hermitian by such a process we write down
the normal form for a matrix where there are two conjugate imaginary
characteristic numbers $\lambda_1, \lambda_2$, each of which has index one and
solution-multiplicity two:


0 0 QO 4,—/ 
0 
O 

l 0 0 


* See Hilton, Linear Substitutions, p. 180. Also Logsdon, Equivalence and reduction
of pairs of hermitian forms, American Journal of Mathematics, vol. 44, pp. 247-60.


BROWN UNIVERSITY,
PROVIDENCE, R. I.


RELATIVE EXTREMA OF PAIRS OF QUADRATIC 
AND HERMITIAN FORMS* 


BY 


R. G. D. RICHARDSON 


INTRODUCTION 


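The problem of relative extrema of pairs of forms, which plays an
increasingly important rôle in analysis, concerns two quadratic forms

$$A(x,x) = \sum_{i,j=1}^{n} a_{ij} x_i x_j, \qquad B(x,x) = \sum_{i,j=1}^{n} b_{ij} x_i x_j; \qquad a_{ij} = a_{ji},\ b_{ij} = b_{ji};$$

or two hermitian forms

$$A(z,\bar z) = \sum_{i,j=1}^{n} a_{ij} \bar z_i z_j, \qquad B(z,\bar z) = \sum_{i,j=1}^{n} b_{ij} \bar z_i z_j; \qquad a_{ij} = \bar a_{ji},\ b_{ij} = \bar b_{ji}.$$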


Does the form A possess a minimum for those real values of the x's or
complex values of the z's which fulfil the condition B = 1 and perhaps
in addition certain auxiliary linear conditions; and if so, how is it obtained?
In the Introduction to the preceding memoir,† the author has indicated
the intimate connection between this problem and that of equivalence of
pairs of forms. Following the usual method of the Lagrange factor as
outlined there for the problem

$$A(x,x) = \text{Min.}, \qquad B(x,x) = 1, \tag{1}$$

we consider the extrema of the form $A - \lambda B$, which leads on differentiation
to the set of necessary conditions

$$\sum_j (a_{ij} - \lambda b_{ij})\, x_j = 0 \qquad (i = 1, \dots, n). \tag{2}$$

A necessary and sufficient condition that a solution of these equations exist
is that $\lambda$ be one of the real roots $\lambda_1 < \cdots < \lambda_n$, called characteristic


* Presented to the Society, September 7, 1923. 

† A new method in the equivalence of pairs of bilinear forms, these Transactions,
vol. 26, pp. 451-478. This will be referred to as I.


numbers, of the determinant $|a_{ij} - \lambda b_{ij}|$. Denoting a solution corresponding
to $\lambda_k$ by

$$X^{(k)} = (x_1^{(k)}, \dots, x_n^{(k)})$$

and multiplying the equations (2) by $x_1^{(k)}, \dots, x_n^{(k)}$ respectively, and adding,
we obtain

$$A(X^{(k)}, X^{(k)}) - \lambda_k B(X^{(k)}, X^{(k)}) = 0, \tag{3}$$

a relation which all solutions must satisfy. It follows from (3) that the
extremal value of the problem must be sought among the numbers $\lambda_k$ and
in this case it is $\lambda_1$.

In the case of the problem with hermitian forms

$$A(z,\bar z) = \text{Min.}, \qquad B(z,\bar z) = 1, \tag{1'}$$

the usual method of differentiating with regard to $\bar z_i$ the expression $A - \lambda B$
gives

$$\sum_j (a_{ij} - \lambda b_{ij})\, z_j = 0, \tag{2'}$$

while differentiation with regard to $z_i$ gives a similar set, which, being
the conjugates of (2'), add nothing new. On multiplying equations (2')
respectively by $\bar z_1^{(k)}, \dots, \bar z_n^{(k)}$ and adding, there results

$$A(Z^{(k)}, \bar Z^{(k)}) - \lambda_k B(Z^{(k)}, \bar Z^{(k)}) = 0. \tag{3'}$$


Returning now to the minimum problem (1) and adding the orthogonal
condition

$$B(X^{(1)}, x) = \sum_{i,j=1}^{n} b_{ij}\, x_i^{(1)} x_j = 0,$$

we are led to consider the minimum of the expression

$$A(x,x) - \lambda B(x,x) - 2\mu B(X^{(1)}, x),$$

which on differentiation with regard to $x_i$ gives the set of non-homogeneous
linear equations

$$\sum_j (a_{ij} - \lambda b_{ij})\, x_j = \mu \sum_j b_{ij}\, x_j^{(1)}. \tag{4}$$


1924] RELATIVE EXTREMA 481 


If B is definite, we shall see (§ 2) that $\mu = 0$; hence the solution of the
extended problem is to be found among the solutions of (2) and the minimal
value is $\lambda_2$ furnished by $X^{(2)}$. Further linear orthogonal conditions may
be successively added, the formal expression to be minimized in the last
problem being

$$A(x,x) - \lambda B(x,x) - 2\mu_1 B(X^{(1)}, x) - \cdots - 2\mu_{n-1} B(X^{(n-1)}, x).$$

It turns out again that in case B is a definite form (or even more generally),
the $\mu$'s are all zero, and hence the solutions of the extended problems are
contained among the solutions of (2), being $X^{(3)}, \dots, X^{(n)}$ respectively and
giving on account of (3) the minima $\lambda_3, \dots, \lambda_n$.

While it is true that in such minimum problems actual extrema exist 
only in the special cases where the forms are real quadratic or hermitian 
and one of them definite (§ 1), the formal work can be carried out, so far 
as the first derivatives (such as (2) and (4) and a corresponding set of 
equations in y) are concerned, in the most general case of bilinear forms 
with complex coefficients. Besides introducing new results we have in 
the previous paper ventured to hope that such a point of view may give 
a unity to the treatment of the problem of equivalence. Let us for 
a moment consider how the discussion of the extrema must be modified 
to include other cases. 

In the irregular cases where $B(x,x)$ is not definite, $B(X^{(k)}, X^{(k)})$ may
vanish and in order to make it unity, the corresponding values of $x$ must
be infinite. In order that the minimum problem (1) have a meaning even
formally, it would then be necessary to modify it and to write

$$A(x,x) = \text{Min.}, \qquad B(x,x) = 0.$$

Further, when $B(x,x)$ is not definite, it may turn out that in the equations
such as (4) the multiplier $\mu$ is not zero and that these equations do not
reduce to (2). The formal extremum in this case is then not a solution
of the equation (2); we have seen in I (§ 5) that corresponding to this
phenomenon is the fact on the algebraic side that the equations (2) have
fewer than $n$ linearly independent solutions. In this event $\mu$ may be given
the value unity (that is, absorbed into the solution $X^{(1)}$), and it was
there shown that the solutions are the same as those of the equations
obtained from (2) by differentiation with regard to $\lambda$, the sets of constants
$X^{(1)}$ being regarded as solutions of the original equation. The discussion
was made to cover further extensions of the problem. If the $\mu$'s are not
all zero, we saw that derived equations such as are suggested by the



formal minimum problem and their solutions played a fundamental role in 
the problems of equivalence. 

The literature of relative extrema of quadratic and hermitian forms 
contains no set of sufficient conditions. In the general theory of extrema 
without auxiliary conditions, the criterion associated with the second 
derivatives asserts* that if the sign of a determinant formed from these 
derivatives and the signs of certain of its minors are all positive for those 
values of the variables which make the first derivatives vanish, a minimum 
is present; if the determinant is negative and the minors are alternately 
positive and negative for these values, a maximum is present; if the 
determinant vanishes, no test is furnished. Unfortunately even in the simplest 
case of relative extrema, a corresponding test fails because of the vanishing 
of this crucial determinant of the second derivatives. This we shall now 
proceed to demonstrate. 

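Denoting by A and B two functions of $n$ variables such that B is
definite we can calculate the second derivatives of the quotient A/B
for those values $x_1^{(0)}, \dots, x_n^{(0)}$ which make the first derivatives zero.
These are

$$\left[\frac{\partial^2 (A/B)}{\partial x_i\,\partial x_j}\right]_0 = \left[\frac{1}{B}\left(\frac{\partial^2 A}{\partial x_i\,\partial x_j} - \frac{A}{B}\,\frac{\partial^2 B}{\partial x_i\,\partial x_j}\right)\right]_0.$$

For the particular case of relative extrema of quadratic or hermitian
forms, B can be set equal to unity while for those values of the variables
which are obtained by setting the first derivatives equal to zero we have seen
that $A = \lambda_k$, where $\lambda_k$ is a zero of the determinant $|a_{ij} - \lambda b_{ij}|$. Since

$$\frac{\partial^2 A}{\partial x_i\,\partial x_j} = 2 a_{ij}, \qquad \frac{\partial^2 B}{\partial x_i\,\partial x_j} = 2 b_{ij},$$

etc., we find then that the combination of second derivatives to be
scrutinized is, except for a constant factor $2^n$, the determinant $|a_{ij} - \lambda_k b_{ij}|$
which vanishes for all such $\lambda_k$, and hence furnishes no criterion.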
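The vanishing of this determinant can be observed numerically. The sketch below is only an illustration under stated assumptions: the two $2 \times 2$ forms are hypothetical sample data, the helper names are invented here, and the second derivatives of $A/B$ are approximated by central differences rather than computed symbolically. At a point where the first derivatives vanish and $B = 1$, the matrix of second derivatives should agree with $2(a_{ij} - \lambda_k b_{ij})$ and its determinant should be zero up to discretization error.

```python
import math

def form(M, x):
    """Value of the quadratic form with symmetric 2x2 matrix M at the point x."""
    return sum(M[i][j] * x[i] * x[j] for i in range(2) for j in range(2))

def smallest_root(A, B):
    """Smaller root of det(A - lam*B) = 0 for symmetric 2x2 A and definite B."""
    p = B[0][0] * B[1][1] - B[0][1] ** 2
    q = -(A[0][0] * B[1][1] + A[1][1] * B[0][0] - 2.0 * A[0][1] * B[0][1])
    r = A[0][0] * A[1][1] - A[0][1] ** 2
    disc = math.sqrt(q * q - 4.0 * p * r)
    return min((-q - disc) / (2.0 * p), (-q + disc) / (2.0 * p))

def hessian_fd(f, x, h=1e-4):
    """Central-difference approximation to the second derivatives of f at x."""
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def at(di, dj):
                y = list(x)
                y[i] += di
                y[j] += dj
                return f(y)
            H[i][j] = (at(h, h) - at(h, -h) - at(-h, h) + at(-h, -h)) / (4.0 * h * h)
    return H

# Hypothetical sample forms; B is positive definite.
A = [[2.0, 1.0], [1.0, 3.0]]
B = [[2.0, 0.0], [0.0, 1.0]]
lam = smallest_root(A, B)

# A solution of (A - lam*B) x = 0 (taken from the second row), scaled so B(x,x) = 1.
x = [A[1][1] - lam * B[1][1], -(A[1][0] - lam * B[1][0])]
scale = 1.0 / math.sqrt(form(B, x))
x = [x[0] * scale, x[1] * scale]

H = hessian_fd(lambda v: form(A, v) / form(B, v), x)
det_H = H[0][0] * H[1][1] - H[0][1] * H[1][0]
expected = [[2.0 * (A[i][j] - lam * B[i][j]) for j in range(2)] for i in range(2)]
```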

It is possible, however, by a different method of attack to deduce 
sufficient conditions for relative extrema. A new necessary condition is 
first developed in § 3 and this turns out to be incisive enough to complete 
the set of criteria for sufficiency. 


1. EXTREMA OF QUADRATIC FORMS 
In discussing the relative extrema of a pair of quadratic forms, it is 
necessary to confine attention to the case where the coefficients of both 


* Hancock, Theory of Maxima and Minima, pp. 91-92.




are real. Unless the contrary is stated, only non-singular quadratic forms
will come under consideration, and they will be denoted by $Q_1(x,x)$, $Q_2(x,x)$
instead of $A(x,x)$, $B(x,x)$. The simplest problem arises when one of
the forms is definite. Let us then consider first those real values of the
variables $x_i$ which give to the positive definite form $Q_2(x,x)$ the value 1.
For these limited values the form $Q_1(x,x)$ can take on limited values only,
and there will be extrema which can be obtained in the usual way. If
$Q_1$ also is definite, the extrema will have the same sign; but if indefinite,
the minimum will be negative and the maximum positive.

If $Q_1$ is positive definite and we set it equal to 1, the form $Q_2$ will have
for extrema the reciprocals of the extrema obtained in the original problem;
if it is negative definite and is set equal to $-1$, the extrema of $Q_2$ will
have the absolute values of these reciprocals. If on the other hand $Q_1$ is
indefinite and it is set equal to 1, the minimum of $Q_2$ will be the reciprocal
of the extremum of the other problem, but the maximum will be infinite,
while for those values which make $Q_1 = -1$, the minimum of $Q_2$ will be
the absolute value of a reciprocal of an extremum of the other problem
and the maximum will be infinite.

The ratio $Q_1/Q_2$ is, of course, the fundamental point for consideration
and when $Q_2$ is indefinite and hence becomes zero for values of the variables
other than zero, the ratio can be made infinite unless the two forms are
linearly dependent. When singular forms are allowed to enter into the
discussion it may be that, for the infinite values of the x's which are
permitted, there may be a finite extremum. For example, the singular
form $Q_1 = (x_1 + x_2)^2$ has a minimum zero for those infinite values of the
variables which make $Q_2 = x_1^2 - x_2^2$ equal to 1. But in any case a positive
definite form must have a finite minimum, a negative must have a finite maximum.*
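The example of the singular form can be followed numerically. The sketch below assumes the reading $Q_1 = (x_1 + x_2)^2$, $Q_2 = x_1^2 - x_2^2$ and uses a parametrization of the curve $Q_2 = 1$ by hyperbolic functions that is introduced here only for illustration (it is not in the paper). Along $x_1 = \cosh t$, $x_2 = -\sinh t$ the condition $Q_2 = 1$ holds for every $t$, while $Q_1 = e^{-2t}$ tends to zero only as the variables become infinite.

```python
import math

def q1(x1, x2):
    """The singular form (x1 + x2)^2."""
    return (x1 + x2) ** 2

def q2(x1, x2):
    """The indefinite form x1^2 - x2^2."""
    return x1 * x1 - x2 * x2

def point_on_constraint(t):
    """A point of the curve Q2 = 1 (illustrative parametrization)."""
    return math.cosh(t), -math.sinh(t)

def trace(ts):
    """For each t, return (t, size of the point, Q2, Q1)."""
    rows = []
    for t in ts:
        x1, x2 = point_on_constraint(t)
        rows.append((t, max(abs(x1), abs(x2)), q2(x1, x2), q1(x1, x2)))
    return rows

rows = trace([0.0, 2.0, 5.0, 10.0])
for t, size, constraint, value in rows:
    # Q1 decreases toward zero while the variables grow without bound.
    print(f"t={t:5.1f}  max|x|={size:12.4f}  Q2={constraint:.6f}  Q1={value:.3e}")
```

The infimum zero is therefore approached but not attained at any finite point of the curve, which is the sense in which the minimum occurs "for infinite values."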

To get a somewhat different point of view, let us consider the extremes
as functions of the coefficients of $Q_1$, $Q_2$. If at the beginning both forms
are positive definite and $Q_2$ is set equal to unity, one can continuously
alter the coefficients of $Q_2$ so that it passes over into an indefinite form.
The maximum of $Q_1$ increases without limit, the minimum remains finite.
When $Q_2$ during the transition becomes singular-definite, the maximum of
$Q_1$ is infinite and remains so when the form becomes indefinite. But when
$Q_2$ is indefinite it may also be equated to $-1$ and a corresponding maximum
of infinity and a finite minimum for $Q_1$ will result.


* Let $Q_2$ be an indefinite form and $Q_1$ a singular definite form. If $Q_1$ is positive its


minimum will be zero and this will occur for values which make GY, = 1 or QY, = —1 or 
for both (e.g., if and (1) = 2? +23; (2) Q, = 2? — 23 
(3) Q, = x? — «22 +2732). . The minimum of Q, for 9, = 1, or for = —1 may, 


however, be greater than zero (e. g. in (2)). 




If, however, initially the form $Q_1$ is indefinite and $Q_2$ passes over from
positive to indefinite, the situation is somewhat different. Consider, for
example, the forms


If $\varepsilon_1$ approaches zero, $Q_1$ has a maximum approaching $+\infty$ and a finite
minimum, while as $\varepsilon_2$ approaches zero, $Q_1$ has a minimum approaching $-\infty$
and a finite maximum. When $\varepsilon_1$ or $\varepsilon_2$ changes sign, $Q_2$ may be set equal
to $-1$ and new extrema of $Q_1$ enter. It is not difficult to see that various
cases will arise in problems of this type according as $Q_1 + kQ_2$ is or is
not a definite form for some value of $k$. We shall proceed now to develop
the general theory.

THEOREM I. If for values of the variables which give to the non-singular
quadratic form $Q_2$ the value $c \neq 0$ the form $Q_1$ is positive (negative) and
has a minimum (maximum) zero, then $Q_1$ is a singular positive (negative)
definite form.

For, let $Q_1$ be reduced by a real non-singular transformation to the
form $Q_1 = \sum_i \varepsilon_i x_i^2$, where some of the $\varepsilon$'s may be positive and some
negative. $Q_2$ will by the same transformation take on a new form. If we
differentiate the quotient $Q_1/Q_2$ with regard to each of the variables, and note
that by hypothesis for the minimum point in question the numerator is
zero and that these derivatives are all zero and $Q_2 = c$, we get the
equations $\varepsilon_i x_i = 0$ $(i = 1, \dots, n)$. Since not all the $x_i$ can vanish, it
follows that at least one of the $\varepsilon$'s (say $\varepsilon_1$) is zero. The form $Q_1$ is then
singular. It is possible to show further that $Q_1$ is definite, for example
that none of the $\varepsilon$'s are negative. In order to do this let it be assumed
on the contrary that $\varepsilon_2 = -1$. Two cases must be considered according
as the corresponding values of $x_1$ in $Q_2$ and of any other variables which
occur in $Q_2$ but not in $Q_1$ are finite or infinite. When this is finite, we
note that it is possible to choose it slightly larger, thus making $Q_1$ negative
and giving to $Q_2$ the value $c + e$ which has the same sign as $c$. By
multiplying all the variables by $\sqrt{c/(c+e)}$ the numerator remains negative
and the denominator is $c$, which is a contradiction of the hypothesis.

If the case of infinite values is considered and $x_1$ is the only variable
missing from $Q_1$, it cannot occur in $Q_2$ to a power higher than the first.
For since the other values of the x's are finite, a square term would be
an infinity of higher order than the others and would make $Q_2$ infinite.
$Q_2$ may then be written $x_1 L(x_2, \dots, x_n) + Q'(x_2, \dots, x_n)$ where $L$ and $Q'$
are respectively linear and quadratic forms in one less variable. But now
$x_2, \dots, x_n$ can be assigned in $Q_1$ and $Q_2$ making the former negative
while $x_1$ can then be assigned to make $Q_2 = c$, which contradicts the
hypothesis.

Again if there were variables in $Q_2$ other than $x_1$ which did not occur
in the numerator and which were infinite for the minimum, the quadratic
form composed of these terms alone must be indefinite, for otherwise $Q_2$
would be infinite. These variables may then be adjusted to make $Q_2 = c$
after the numerator is made negative, which leads to the same contradiction
as above.

THEOREM II. If for values of the variables which give to the indefinite
form $Q_2$ the value $c \neq 0$ the indefinite form $Q_1$ has a minimum (maximum) $k$,
it may be written

$$Q_1 = \frac{k}{c}\, Q_2 + P \qquad \left(Q_1 = \frac{k}{c}\, Q_2 - P\right),$$

where P is a positive singular form.

This follows immediately from the preceding theorem since the minimum
of $Q_1 - (k/c)\, Q_2$ under the conditions imposed is zero.

COROLLARY. When $Q_1$ and $Q_2$ are indefinite but there exist constants $k_1, k_2$
such that $k_1 Q_1 + k_2 Q_2$ is singular definite, the ratio $Q_1/Q_2$ takes on the
values 0 and $\pm\infty$ but there may be a single continuous range of values which
it does not take on.

For, on writing

$$\frac{k_1 Q_1}{k_2 Q_2} + 1 = \frac{P}{k_2 Q_2},$$

where P is a singular definite positive form, we have (cf. footnote p. 483)
that the expression on the right may for positive $k_2 Q_2$ take on values
from a positive constant (or zero) to $+\infty$ while for negative $k_2 Q_2$ it takes
on values from a negative constant (or zero) to $-\infty$. If both ranges include
zero the ratio can be any value whatever; but if not, there is a gap in
the values taken on by $k_1 Q_1/k_2 Q_2$ and hence by $Q_1/Q_2$.
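A small numerical experiment may make the gap concrete. The pair of forms below is a hypothetical example chosen for this sketch, not one taken from the paper: $Q_1 = 2x_1^2 - x_2^2$ and $Q_2 = x_1^2 - x_2^2$ are both indefinite, while $Q_1 - Q_2 = x_1^2$ is singular definite. Writing $Q_1/Q_2 = 1 + x_1^2/Q_2$ shows that the ratio is at least 2 where $Q_2 > 0$ and at most 1 where $Q_2 < 0$, so the open interval between 1 and 2 should never be reached.

```python
import math

def q1(x1, x2):
    """Hypothetical indefinite form 2*x1^2 - x2^2."""
    return 2.0 * x1 * x1 - x2 * x2

def q2(x1, x2):
    """Hypothetical indefinite form x1^2 - x2^2."""
    return x1 * x1 - x2 * x2

def sampled_ratios(n=1000):
    """Values of Q1/Q2 along n directions (the ratio depends only on direction)."""
    ratios = []
    for k in range(n):
        t = math.pi * (k + 0.5) / n
        x1, x2 = math.cos(t), math.sin(t)
        d = q2(x1, x2)
        if abs(d) > 1e-12:          # skip directions on which Q2 vanishes
            ratios.append(q1(x1, x2) / d)
    return ratios

ratios = sampled_ratios()
# Values strictly inside the predicted gap (1, 2), allowing for rounding.
in_gap = [r for r in ratios if 1.0 + 1e-9 < r < 2.0 - 1e-9]
lower_branch = [r for r in ratios if r <= 1.0 + 1e-9]
upper_branch = [r for r in ratios if r >= 2.0 - 1e-9]
```

Both branches are non-empty and unbounded as the direction approaches a zero of $Q_2$, and the value 0 lies in the lower branch (it is taken where $x_2^2 = 2x_1^2$), in agreement with the statement of the corollary.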


2. RELATIVE MINIMA OF QUADRATIC AND HERMITIAN FORMS 
In taking up the formal analysis of the actual problems of relative
extrema, we note from the discussion of the preceding section that for
quadratic forms nothing is lost in generality by assuming that the form B
is positive definite and has the value unity ($B(x,x) = 1$). By means of
a real substitution the pair of quadratic forms can be reduced in this case to

$$A(x,x) = \sum_{i=1}^{n} \lambda_i x_i^2, \qquad B(x,x) = \sum_{i=1}^{n} x_i^2 \qquad (\lambda_1 < \lambda_2 < \cdots < \lambda_n).$$

The minimum $\lambda_1$ of A for those values which make $B(x,x) = 1$ is thus
given by $x_1 = 1$, $x_2 = \cdots = x_n = 0$ and the maximum $\lambda_n$ by
$x_1 = \cdots = x_{n-1} = 0$, $x_n = 1$. If in addition to the quadratic condition it
is demanded that the solution be orthogonal to the minimum already found,
$B(X^{(1)}, x) = 0$, it is necessary that $x_1 = 0$ and the minimum $\lambda_2$ is
furnished by $x_2 = 1$, $x_1 = x_3 = \cdots = x_n = 0$. And in general,
the solution $x_1 = \cdots = x_{m-1} = x_{m+1} = \cdots = x_n = 0$, $x_m = 1$ gives
$\lambda_m$ as the minimum of A under the quadratic condition $B = 1$ and the
linear conditions of orthogonality, $B(X^{(k)}, x) = 0$, when $k$ ranges over
$1, \dots, m-1$, and as the maximum under the quadratic condition and the same
linear condition when $k$ ranges over $m+1, \dots, n$.

These results may be immediately translated into terms of the original 
forms A and B. 

THEOREM III. If $A(x,x)$ and $B(x,x)$ are two real non-singular quadratic
forms of which B is positive definite and if $\lambda_1 < \lambda_2 < \cdots < \lambda_n$ are
the n characteristic numbers of the $\lambda$-determinant $|a_{ij} - \lambda b_{ij}|$, then the
minimum of $A(x,x)$ for which $B(x,x) = 1$ is $\lambda_1$ furnished by $X^{(1)}$ and
the maximum is $\lambda_n$ furnished by $X^{(n)}$; and in general $X^{(m)}$ furnishes the
minimum $\lambda_m$ under the added conditions $B(X^{(1)}, x) = 0, \dots, B(X^{(m-1)}, x) = 0$
and the maximum $\lambda_m$ under the added conditions $B(X^{(m+1)}, x) = 0, \dots,$
$B(X^{(n)}, x) = 0$.
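The first part of the theorem can be illustrated for $n = 2$ by direct sampling. What follows is a sketch under stated assumptions: the matrices are hypothetical sample data, the helper names are invented here, and the points with $B(x,x) = 1$ are obtained by rescaling directions on the unit circle. The smallest and largest sampled values of $A$ should approach the two roots of the $\lambda$-determinant, and the two normalized solutions should be orthogonal with respect to $B$.

```python
import math

def form(M, x):
    """Value of the quadratic form M(x, x) for a symmetric 2x2 matrix M."""
    return sum(M[i][j] * x[i] * x[j] for i in range(2) for j in range(2))

def polar(M, x, y):
    """Value of the associated bilinear form M(x, y)."""
    return sum(M[i][j] * x[i] * y[j] for i in range(2) for j in range(2))

def pencil_roots(A, B):
    """Both roots of det(A - lam*B) = 0, in increasing order (B definite)."""
    p = B[0][0] * B[1][1] - B[0][1] ** 2
    q = -(A[0][0] * B[1][1] + A[1][1] * B[0][0] - 2.0 * A[0][1] * B[0][1])
    r = A[0][0] * A[1][1] - A[0][1] ** 2
    disc = math.sqrt(q * q - 4.0 * p * r)
    return sorted([(-q - disc) / (2.0 * p), (-q + disc) / (2.0 * p)])

def pencil_solution(A, B, lam):
    """Solution of (A - lam*B) x = 0 normalized so that B(x, x) = 1."""
    r1 = (A[0][0] - lam * B[0][0], A[0][1] - lam * B[0][1])
    r2 = (A[1][0] - lam * B[1][0], A[1][1] - lam * B[1][1])
    # Use the row with the larger entries; x is taken perpendicular to it.
    r = r1 if abs(r1[0]) + abs(r1[1]) >= abs(r2[0]) + abs(r2[1]) else r2
    x = (r[1], -r[0])
    s = 1.0 / math.sqrt(form(B, x))
    return (x[0] * s, x[1] * s)

# Hypothetical sample forms; B is positive definite.
A = [[2.0, 1.0], [1.0, 3.0]]
B = [[2.0, 0.0], [0.0, 1.0]]
lam1, lam2 = pencil_roots(A, B)
X1 = pencil_solution(A, B, lam1)
X2 = pencil_solution(A, B, lam2)

# Sample A on the curve B(x, x) = 1.
values = []
for k in range(3600):
    t = math.pi * k / 3600
    d = (math.cos(t), math.sin(t))
    s = 1.0 / math.sqrt(form(B, d))
    values.append(form(A, (d[0] * s, d[1] * s)))

sampled_min, sampled_max = min(values), max(values)
```

With only two variables the added orthogonality condition $B(X^{(1)}, x) = 0$ already singles out $X^{(2)}$, so the second part of the theorem reduces here to the check that $B(X^{(1)}, X^{(2)}) = 0$.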

This theorem may also be derived directly and the actual existence of 
the extrema shown by expansion* in terms of constants in somewhat the 


* Since any set of $n$ constants $C_1, \dots, C_n$ can be expressed linearly in terms of any $n$
given linearly independent sets of $n$ constants each, either set of normalized solutions in
form like (16) or (16') of I will serve as a basis for such expansion. To calculate the
coefficients let us denote them by $c_1, \dots, c_n$ and write

— 

When » corresponds to a solution of multiplicity unity and these equations are multiplied 
respectively by ; and = result summed over j, the first two theorems 
of I determine the coefficient as 3,;C; bj; yi = c;,. If on the other hand » corresponds 
to a multiple solution and is one of the which correspond 
successively to a primary and its corresponding solution of the derived equations, it is 
necessary to multiply by 

J 
and in that case from the theorems of I (§ 5) we have, by a similar process, 


Sn (24+m,—k+1) __ 


If the y’s are the same as the z’s, the results are correspondingly modified. 




same fashion as was done in an article in these Transactions.* But 
the process ordinarily used in analysis is more closely allied to the methods 
indicated in the Introduction and we shall proceed in that direction. We 
propose therefore to treat in detail by the ordinary methods of relative 
maxima and minima some one problem, and have chosen for this purpose 
hermitian forms, which are more general than the real quadratic. 

In order that a hermitian form $A(z, \bar z)$ be definite it is necessary and
sufficient that the quadratic form in the x's and y's into which it may
be resolved by the substitution $z = x + iy$ be definite also. In discussing
the relative extrema we shall then assume that $B(z, \bar z)$ is positive definite.
Since the problem of maxima is treated in the same fashion, only the
minima will be discussed. The simplest problem may be set up as follows:

$$A(z, \bar z) = \text{Min.}, \qquad B(z, \bar z) - 1 = 0. \tag{5}$$


Proceeding in the manner usual in the analogous problem where the
variables and coefficients are real, let us consider the expression

$$A(z, \bar z) - \lambda[B(z, \bar z) - 1] = \text{Min.} \tag{6}$$

When $\lambda$ is real, this real hermitian form can be derived with regard to
each x and each y, the $2n$ partial derivatives being real expressions to
be equated to zero to furnish the minimum. It is, however, not necessary
to resort to that device. If we derive with regard to $\bar z_i$ there result $n$
expressions to be equated to zero, deriving with regard to $z_i$ giving expressions
conjugate to the first set which on equating to zero give nothing new.
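* A new method in boundary problems for differential equations, vol. 18, pp. 495-496.

From the relations

$$x_i = \frac{z_i + \bar z_i}{2}, \qquad y_i = \frac{z_i - \bar z_i}{2i}$$

and

$$\frac{\partial H}{\partial z_i} = \frac{\partial H}{\partial x_i}\frac{\partial x_i}{\partial z_i} + \frac{\partial H}{\partial y_i}\frac{\partial y_i}{\partial z_i}, \qquad \frac{\partial H}{\partial \bar z_i} = \frac{\partial H}{\partial x_i}\frac{\partial x_i}{\partial \bar z_i} + \frac{\partial H}{\partial y_i}\frac{\partial y_i}{\partial \bar z_i}$$

it follows that

$$\frac{\partial H}{\partial z_i} = \frac{1}{2}\frac{\partial H}{\partial x_i} + \frac{1}{2i}\frac{\partial H}{\partial y_i}, \qquad \frac{\partial H}{\partial \bar z_i} = \frac{1}{2}\frac{\partial H}{\partial x_i} - \frac{1}{2i}\frac{\partial H}{\partial y_i},$$

where $H$ denotes the expression (6). Since the expressions $\partial H/\partial x_i$ and $\partial H/\partial y_i$ are real, equating to zero the
derivatives of $H$ with regard to $\bar z_i$ gives an equation equivalent to the two
obtained by equating to zero the derivatives $\partial H/\partial x_i$ and $\partial H/\partial y_i$. In other
words, each of the $n$ equations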


$$\sum_j (a_{ij} - \lambda b_{ij})\, z_j = 0 \qquad (i = 1, \dots, n) \tag{7}$$

is equivalent to the 2 linear homogeneous equations obtained by deriving
with respect to $x_i$ and $y_i$. The set (7) contains $n$ equations in $n$ complex
variables and the others form a set of $2n$ equations in $2n$ real variables.
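The equivalence between one complex equation and two real ones can be checked numerically. The sketch below rests on an assumption about notation that the damaged source does not settle: the hermitian form is taken as $\sum a_{ij} \bar z_i z_j$, so that the derivative with regard to $\bar z_i$ is $\sum_j a_{ij} z_j$. The matrix, the sample point and the helper names are all hypothetical. The real partial derivatives with regard to $x_i$ and $y_i$ are approximated by central differences and combined as $\tfrac12(\partial H/\partial x_i + i\,\partial H/\partial y_i)$, which is the same as $\tfrac12\,\partial H/\partial x_i - \tfrac{1}{2i}\,\partial H/\partial y_i$.

```python
def hermitian_form(M, z):
    """Real value of sum_ij M[i][j] * conj(z_i) * z_j for a hermitian matrix M."""
    n = len(z)
    total = sum(M[i][j] * z[i].conjugate() * z[j] for i in range(n) for j in range(n))
    return total.real

def partials_xy(M, z, i, h=1e-6):
    """Central-difference derivatives of the form with regard to x_i and y_i."""
    zp, zm = list(z), list(z)
    zp[i], zm[i] = z[i] + h, z[i] - h
    dH_dx = (hermitian_form(M, zp) - hermitian_form(M, zm)) / (2.0 * h)
    zp[i], zm[i] = z[i] + 1j * h, z[i] - 1j * h
    dH_dy = (hermitian_form(M, zp) - hermitian_form(M, zm)) / (2.0 * h)
    return dH_dx, dH_dy

def conjugate_derivative(M, z, i):
    """Derivative with regard to conj(z_i), assembled from the two real derivatives."""
    dH_dx, dH_dy = partials_xy(M, z, i)
    return 0.5 * (dH_dx + 1j * dH_dy)

# Hypothetical hermitian matrix (a_12 is the conjugate of a_21) and sample point.
A = [[2.0, 1.0 + 1.0j], [1.0 - 1.0j, 3.0]]
z = [0.5 - 0.25j, -1.0 + 2.0j]

numeric = [conjugate_derivative(A, z, i) for i in range(2)]
direct = [sum(A[i][j] * z[j] for j in range(2)) for i in range(2)]
```

Since the two real derivatives are the real and imaginary parts of twice the complex one, the complex expression vanishes exactly when both real ones do.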

By a theorem analogous to I (XV) each solution-multiplicity of (7) is unity
and hence the solutions for the various $\lambda$'s can be written down as linearly
independent. The solution for the conjugate equations can likewise be
written down, each being conjugate to one of the other set. One of these
solutions $Z, \bar Z$ when normalized to satisfy the relation $B(z, \bar z) = 1$ (which
may also be considered as obtained from (5) by differentiation with regard
to $\lambda$), must furnish the minimum for $A(z, \bar z)$. Since these give to A the
values (3') $\lambda_1 < \cdots < \lambda_n$ the minimum must be furnished by $Z^{(1)}, \bar Z^{(1)}$ which
gives to A the value $\lambda_1$; more specific discussion is reserved for the next
section.

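To determine the minimum under the added orthogonal condition

$$B(Z^{(1)}, \bar z) = 0 \quad (\text{or the conjugate } B(z, \bar Z^{(1)}) = 0), \tag{8}$$

we set up as the expression to be minimized

$$A(z, \bar z) - \lambda[B(z, \bar z) - 1] - 2\mu[B(Z^{(1)}, \bar z)],$$

and differentiate with regard to the z's, $\lambda$ and $\mu$, obtaining $n+1$ linear
homogeneous equations in the z's and $\mu$,

$$\sum_j (a_{ij} - \lambda b_{ij})\, z_j - 2\mu \sum_j b_{ij}\, z_j^{(1)} = 0 \qquad (i = 1, \dots, n), \tag{9}$$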

together with (8) and the quadratic relation of (5). Each of the derived
equations (9) in z is equivalent to two equations, both containing x's
and y's; and it is easily shown that these two are the derived equations
of those which would be set up from one of the equations (7) by separation
into real and imaginary parts. In order that these linear equations in
$z_i$ and $\mu$ have a solution, the necessary and sufficient condition is the
vanishing of the determinant formed by bordering $|a_{ij} - \lambda b_{ij}|$ with one
row and one column only (cf. (10)). This polynomial in $\lambda$, which is real
because the terms symmetrical with the main diagonal are conjugate, is
of the $(n-1)$th degree and we can readily show that the roots are
$\lambda_2, \dots, \lambda_n$. For, if we set $\lambda = \lambda_i$ $(i = 2, \dots, n)$ and multiply the first $n$
rows respectively by $\bar z_1^{(i)}, \dots, \bar z_n^{(i)}$ and add, we have zero for each column.
This is true for the first $n$ columns because the solution $Z^{(i)}$ is so defined,
while for the last ($(n+1)$th) column it follows from the orthogonality
relation (8).

To show that $\mu$ is zero we multiply the equations (9) by $\bar z_1^{(1)}, \dots, \bar z_n^{(1)}$
respectively and add, giving for the first $n$ columns

$$\sum_i (a_{ij} - \lambda b_{ij})\, \bar z_i^{(1)} \qquad (j = 1, \dots, n),$$

the coefficients of $z_j$ being conjugates of the left hand sides of (7) and
hence zero. For the other part we have on addition $\mu B(Z^{(1)}, \bar Z^{(1)})$, which
must then be zero from the equations (9); and from (5) it follows that
$\mu = 0$. Equations (9) reduce to (7) but since for solutions of such equations
we have, from (3'), $A(Z^{(i)}, \bar Z^{(i)}) = \lambda_i B(Z^{(i)}, \bar Z^{(i)})$, the minimum process picks
out the solution $Z^{(2)}, \bar Z^{(2)}$ which when normalized to satisfy (5) furnishes
to A the value $\lambda_2$.

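Proceeding in the same manner we can show successive minima to exist
and that they are furnished by solutions of the equations (7) for $\lambda = \lambda_3, \dots$.
The expression to be used in the last case when the additional orthogonal
conditions

$$\sum b_{ij}\, \bar z_i^{(1)} z_j = 0, \quad \dots, \quad \sum b_{ij}\, \bar z_i^{(n-1)} z_j = 0$$

are imposed is

$$A(z, \bar z) - \lambda B(z, \bar z) - 2\mu_1 B(Z^{(1)}, \bar z) - \cdots - 2\mu_{n-1} B(Z^{(n-1)}, \bar z),$$

and the determinant of the linear equations is

$$\begin{vmatrix}
a_{11} - \lambda b_{11} & \cdots & a_{1n} - \lambda b_{1n} & \sum_j b_{1j} z_j^{(1)} & \cdots & \sum_j b_{1j} z_j^{(n-1)} \\
\vdots & & \vdots & \vdots & & \vdots \\
a_{n1} - \lambda b_{n1} & \cdots & a_{nn} - \lambda b_{nn} & \sum_j b_{nj} z_j^{(1)} & \cdots & \sum_j b_{nj} z_j^{(n-1)} \\
\sum_i b_{i1} \bar z_i^{(1)} & \cdots & \sum_i b_{in} \bar z_i^{(1)} & 0 & \cdots & 0 \\
\vdots & & \vdots & \vdots & & \vdots \\
\sum_i b_{i1} \bar z_i^{(n-1)} & \cdots & \sum_i b_{in} \bar z_i^{(n-1)} & 0 & \cdots & 0
\end{vmatrix} \tag{10}$$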


The polynomial in $\lambda$ is real and of the first degree and since on setting
$\lambda = \lambda_n$ and multiplying the first $n$ rows by $\bar z_1^{(n)}, \dots, \bar z_n^{(n)}$ respectively and
adding, the sum of each column is zero, it follows that this polynomial
has $\lambda_n$ for zero. To show that $\mu_i = 0$, we multiply the first $n$ linear
equations corresponding to the matrix (10) by $\bar z_1^{(i)}, \dots, \bar z_n^{(i)}$ and on addition
note that the coefficients of the $z_j$ in the first $n$ columns are the conjugates
of the left hand sides of (7) and hence are by hypothesis zero for $i = 1, \dots, n-1$.
On account of the orthogonal conditions imposed, the sums of all the other
columns so obtained will also be zero, except that the $(n+i)$th gives
$\mu_i B(Z^{(i)}, \bar Z^{(i)})$ and since this must also be zero, it follows that $\mu_i = 0$.
The first $n$ equations reduce then to (7) and, by (3'), it follows that A
has a minimum (and only) value $\lambda_n$.

When neither of the hermitian forms is definite some or all of the 
characteristic numbers may be complex. The complex $\lambda$'s occur in conjugate
imaginary pairs and the corresponding solutions will be conjugate
imaginary. In this case by proceeding formally and imposing the orthogonal 
conditions which naturally arise the problem may eventually be reduced 
to one which has real significance.* 

As a typical example where both forms are indefinite and where solutions 
of the derived equations arise in the formal solution of the minimum problem, 
let us consider the problem of I (§ 6) connected with the matrix (33) and 
in order that it may have a meaning as a minimum problem let us consider
the form as quadratic and not bilinear.† If the coefficients of A and B
are real the $\lambda$'s here will be real but in similar problems some might well
be complex, with another set conjugate to them.

For this problem the linear equations obtained by differentiating with
regard to the x's the expression $A(x,x) - \lambda[B(x,x) - c]$ have solutions
which make $B(x,x) = 0$, so that in order to satisfy the quadratic equation
also we must modify by setting $c = 0$ before proceeding. The next step
formally is to minimize

$$A(x,x) - \lambda[B(x,x)] - 2\mu[B(X^{(1)}, x) - c],$$

where $X^{(1)}$ is the formal solution of the modified first problem. This gives
rise to the linear equations which we have called derived equations (the

*For example, if 


A = 24,2, +223 — 22, = 2%, 


the $\lambda$'s are $i, -i, 1, -2$, the formal orthogonal conditions corresponding to $i$ and $-i$ are
$x_1 = 0$, $x_2 = 0$ and since after these conditions are imposed, B is definite, the resulting
problem may be solved in the usual manner.

† A part of the formal work can be carried out even with bilinear forms by
differentiating both with regard to x's and y's.




multiplier μ can be absorbed into the solution Y^(2)). In a manner similar to
that used above with hermitian forms it can be shown that the polynomial
in λ resulting from the determinant formed from the four linear equations
in x_1, x_2, x_3, μ is of the second degree and that λ_1 is a (double) root. The
solution Y^(2) of the derived equations can be made to satisfy the quadratic
condition B(Y^(2), Y^(2)) = 1 while B(X^(1), X^(1)) is zero no matter what
solution is chosen for X^(1). Since B(Y^(2), X^(1)) = 0, the argument that
μ = 0, given in connection with the equations (9), is no longer valid; in
fact μ must be different from zero. For if μ = 0, equations (9)
reduce to (7), which here have by hypothesis one solution only, so that it would
follow that Y^(2) = X^(1), and this would lead nowhere.
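
The behaviour just described can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the matrices A and B and the root lam1 are hypothetical (they are not the matrix (33) of the text), chosen so that B is indefinite and the first set of linear equations has a single solution X. It then solves derived equations of the assumed form (A − λ_1 B)Y = μBX with μ = 1 and evaluates the form B on the solutions.

```python
import numpy as np

lam1 = 2.0  # hypothetical repeated root, chosen only for illustration

# Hypothetical pair of real symmetric matrices (NOT the matrix (33) of the text):
# B is indefinite and B^{-1}A is a single 3x3 Jordan block with root lam1.
B = np.array([[0.0, 0.0, 1.0],
              [0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0]])
A = np.array([[0.0, 0.0, lam1],
              [0.0, lam1, 1.0],
              [lam1, 1.0, 0.0]])
M = A - lam1 * B  # coefficient matrix of the first set of linear equations


def form_b(u, v):
    """Bilinear form B(u, v) = u^T B v."""
    return float(u @ B @ v)


# The null space of M is one-dimensional: a single solution X up to a factor.
_, sing_vals, vt = np.linalg.svd(M)
null_dim = int(np.sum(sing_vals < 1e-12))
X = vt[-1]

# Assumed derived equations (A - lam1*B) Y = mu * B X, with mu = 1
# (a nonzero multiplier can be absorbed into Y). With mu = 0 the system
# would reduce to M Y = 0, whose only solutions are multiples of X.
mu = 1.0
Y, *_ = np.linalg.lstsq(M, mu * (B @ X), rcond=None)

print("dimension of the solution space of the first equations:", null_dim)
print("B(X, X) =", form_b(X, X))
print("B(Y, Y) =", form_b(Y, Y))
print("residual of the derived equations:", np.linalg.norm(M @ Y - mu * (B @ X)))
```

In this hypothetical example B(X, X) vanishes for the only available solution X, whereas the solution Y of the derived equations satisfies the quadratic condition B(Y, Y) = 1, which is the situation the paragraph above describes.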
Proceeding to the final stage we set up the expression 


A(x, x) − λ[B(x, x) − c] − μ[B(X^(1), x) − c_1] − ν[B(Y^(2), x) − c_2]


and derive with regard to x_1, x_2, x_3, μ, ν, λ. The five linear equations
furnish a determinant which on equating to zero gives a polynomial of first
degree for λ. It may be shown that λ = λ_1 is the root and X^(3) a solution.
By a slight extension of the methods in I (§ 6) it may be shown that 
there are in these linear equations enough arbitrary constants and solutions 
capable of adjustment to make B(X), ¥) = 0, B(X°, X®) = 0 while 
B(X®, X®) = B(X®, 


3. A NEW NECESSARY CONDITION FOR RELATIVE EXTREMA IN QUADRATIC AND 
HERMITIAN FORMS, SUFFICIENT CONDITIONS 


In determining in § 2 the minimum of the hermitian form A(z, z) under
the quadratic condition that the positive definite hermitian form B(z, z)
be unity and under the m linear conditions
(11) B(Z^(1), z) = 0, ···, B(Z^(m), z) = 0,
we were led to the consideration of a determinant D(λ) made up of the
determinant | a_ij − λ b_ij | bordered by m rows and columns (cf. (10) or (13)).
This determinant is of degree n − m in λ and, as the author has shown
in another paper*, the zeros of such a polynomial are all real. In
§ 2, it was shown that these zeros are actually λ_{m+1}, ···, λ_n, where
λ_1 ≤ λ_2 ≤ ··· are the zeros of the unbordered determinant | a_ij − λ b_ij |.
For each of these n − m values λ_i, the linear equations corresponding to


* On the reality of the zeros of a λ-determinant, Bulletin of the American
Mathematical Society, vol. 29, p. 467. See also Hilton, Linear Substitutions, for a somewhat
less general theorem.
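
The statement about the zeros of the bordered determinant D(λ) may be verified numerically in a special case. The following is a minimal sketch, assuming real symmetric matrices (a special case of the hermitian forms of the text) and taking for the Z^(k) the first m solutions normalized so that B(Z^(j), Z^(k)) = δ_jk. The matrices A and B, the sizes n = 4 and m = 2, and the helper names `bordered_det` and `product_form` are all hypothetical and serve only as illustration.

```python
import numpy as np

# Hypothetical real symmetric A and positive definite B (n = 4), with m = 2
# linear conditions; these are illustrative values, not taken from the text.
A = np.array([[2.0, 1.0, 0.0, 0.0],
              [1.0, 3.0, 1.0, 0.0],
              [0.0, 1.0, 4.0, 1.0],
              [0.0, 0.0, 1.0, 5.0]])
B = np.array([[2.0, 1.0, 0.0, 0.0],
              [1.0, 2.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])
n, m = 4, 2

# Zeros lam_1 <= ... <= lam_n of |a_ij - lam*b_ij| and the corresponding
# solutions Z^(k), obtained by a Cholesky reduction to an ordinary
# symmetric eigenvalue problem; Z satisfies Z^T B Z = I.
L = np.linalg.cholesky(B)
L_inv = np.linalg.inv(L)
lams, V = np.linalg.eigh(L_inv @ A @ L_inv.T)
Z = L_inv.T @ V

# The m conditions B(Z^(k), x) = 0 contribute the border columns B Z^(k).
C = B @ Z[:, :m]


def bordered_det(lam):
    """Determinant |A - lam*B| bordered by m rows and columns."""
    top = np.hstack([A - lam * B, C])
    bottom = np.hstack([C.T, np.zeros((m, m))])
    return np.linalg.det(np.vstack([top, bottom]))


def product_form(lam):
    """Polynomial of degree n - m with zeros lam_{m+1}, ..., lam_n."""
    return (-1) ** m * np.linalg.det(B) * np.prod(lams[m:] - lam)


for lam in (-1.0, 0.5, 7.0):
    print(lam, bordered_det(lam), product_form(lam))
```

Under these assumptions the two columns printed for each trial value of λ agree, so that D(λ) is of degree n − m and vanishes exactly at λ_{m+1}, ···, λ_n.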




494 R. G. D. RICHARDSON 


When no linear conditions (11) are imposed, the σ_i represent sums of
principal minors of the determinant | a_ij − λ_1 b_ij |.
It is not difficult to write down from considerations of symmetry the 
corresponding theorem for maxima. 
BROWN UNIVERSITY,
PROVIDENCE, R. I. 


ERRATA, VOLUME 25 


J. F. RITT, Permutable rational functions.
Page 399, second line from bottom (footnote), for “exists” read “exist.” 
Page 402, line 21, for “see the” read “see how the.” 

NORBERT WIENER, Discontinuous boundary conditions and the Dirichlet problem. 
Page 313, line 14, the exponent of (PQ) should be 2−n, not n−2.
Page 314, line 1, same correction. 
Page 314, line 3, for “ca^(n−2)” read “ca^(2−n).”


